Recursivity and Focus in the Prosody of Xitsonga DPs

Abstract: This paper explores the prosodic patterns of complex DP structures in Xitsonga by looking at penultimate lengthening in DPs with marked and unmarked word orders of different types. We discuss the underlying syntactic structures and prosodic realizations of Xitsonga DPs. We are particularly interested in the way in which recursion applies in the Xitsonga DP, where it surfaces in DPs with multiple modifiers of the same or different categories that appear in fronted (i


Introduction
Adjectives are among the syntactic categories that can recur multiple times in a single noun phrase (DP), limited mostly by semantic constraints on meaningful interpretations. In this paper, we show that in the Bantu language Xitsonga, adjectives and other types of nominal modifiers can not only be merged recursively but can also undergo movement multiple times, resulting in completely free word order in the Xitsonga DP. However, closer examination reveals that while the word order is free in Xitsonga DPs, the processes creating these orders are constrained. We argue that only one element can move to a focus position immediately above DP, which we call Focus Phrase (FocP), even though multiple DP modifiers can appear before the noun. Prosodically, this means that recursive phonological phrases are created in complex DPs.
Xitsonga (Guthrie code: S53) is a Bantu language spoken by about 5.5 million first language speakers in South Africa, Mozambique and Zimbabwe. While there is no literature on the syntax of the DP in Xitsonga, Xitsonga grammar and syntax have been described in Baumbach (1987) and du Plessis et al. (1995). The general tonology and prosody of Xitsonga have been described in Kisseberth (1994), Lee and Selkirk (2022), Selkirk (2011) and Zerbian (2007). Penultimate lengthening patterns in single-modifier DPs with focus movement in Xitsonga are studied in Lee and Riedel (2021). There are several varieties of Xitsonga, spoken in different regions. Xitsonga is part of the Southern Bantu languages group (Zone S), where the different varieties form a subgroup of their own.
There is evidence that Southern Bantu DPs in general, not only Xitsonga DPs, differ significantly in their ordering restrictions from other Bantu languages. For most of Southern Bantu, literature such as du Plessis and Visser (1992) on isiXhosa, Sabelo (1990) on isiZulu, Mokoaleli et al. (2021) on Sesotho (as well as our own data on Sesotho) and Creissels (2014) on Tswana shows that many or even all types of nominal modifier, including relatives (du Plessis and Visser 1992, p. 389), possessives and associative phrases, can appear before the noun. This means that, like Xitsonga, most if not all Southern Bantu languages allow a wide range of nominal modifiers to appear before the noun.
The paper is structured as follows: We first present a brief background on Bantu DPs before introducing Xitsonga DPs. Section 2 gives an overview of prosodic phrasing in Xitsonga. Section 3 discusses the DP in Xitsonga and the prosodic reflexes of word order variation. Section 4 offers our analysis of these patterns and Section 5 offers our conclusions.

Background: Bantu DP Structure
Xitsonga's tolerance of free word order in the DP is unusual in Bantu, where flexibility in the post-nominal position has been frequently noted (Rugemalira 2007) but normally only demonstratives can also appear in pre-nominal position. Here we provide a brief overview of relevant word order patterns applying to Bantu DPs.
Bantu DPs are typically noun-initial. Bantu languages have a number of different types of nominal modifiers including demonstratives, possessives, associatives (genitives), adjectives, numerals, quantifiers and arguably also question words such as 'which' (see Katamba 2003 for a general overview of Bantu nominal morphology, van de Velde 2019 for an overview of nominal syntax and Bearth 2003 for general syntactic properties). Demonstratives may appear before or immediately after the noun or in final position of a DP in unmarked order, which differs depending on the language (cf. van de Velde 2019). Bantu languages typically have very small adjective inventories and for some Bantu languages it has been argued that apparent adjectives are relative clauses. Across Bantu, numerals tend to behave categorically like adjectives and also agree with their head noun. Generally only some numerals agree in Bantu languages. These tend to be based on stems inherited from Proto-Bantu and typically represent the numbers lower than 6 or 10, while higher numbers are expressed by phrases and are often based on nouns rather than adjectives. In Southern Bantu languages, speakers typically use numbers borrowed from English in certain contexts (such as when referring to dates, time or phone numbers), but for small numbers the non-borrowed agreeing forms (see (7) below) remain the dominant pattern. As (1a,b) in contrast to (1c,d) show, there may be different unmarked orders for numerals and adjectives in different Bantu languages. Rugemalira (2007) shows that in the post-nominal position different orders of adjectives (a category which for him includes agreeing quantifiers such as 'many'), numerals, ordinals, other quantifiers, associative phrases and relatives are possible in a number of Tanzanian Bantu languages, with only demonstratives, possessives and interrogatives that modify noun phrases having fairly fixed positions.
In Bantu languages, quantifiers may agree or not agree with the head noun, depending on the quantifier type and language (for a general overview of quantifiers in Bantu cf. Zerbian and Krifka 2008). Typically quantifiers are not flexible with respect to their position before or after the noun. For example, in many Bantu languages certain quantifiers must be pre-nominal and unlike other modifiers, these quantifiers do not agree with the number or gender of the noun (2). However, in some Bantu languages quantifiers that normally appear after the noun can also appear prenominally (cf. Rutooro, Clemens and Bickmore 2021, and Ha, Harjula 2004).
(2) Examples from (Rugemalira 2007, p. 138
There is very limited available data on pre-nominal modifiers other than demonstratives and quantifiers beyond Southern Bantu but possessives have been reported to be allowable in pre-nominal positions in some Bantu languages that are not part of the Southern Bantu group. Van de Velde (2019, p. 260) notes optional pre-nominal possessives in Makaa (A83) which receive a focus reading, similar to the structures discussed here.
Having established that word order freedom in DP generally does not apply to the pre-nominal domain in Bantu languages, we show below that in Xitsonga any nominal modifiers are allowed to appear in pre-nominal position (see Section 3). Carstens (2008) proposes the structure in (4) for the DP in Swahili, representing the noun phrase in (3) where the noun raises to D while the demonstrative is generated in a high position in the specifier of an XP immediately below DP. Swahili only allows demonstratives and the quantifier kila 'each, every' (2b) to appear before the noun. Carstens (2008, p. 155) discusses movements deriving variable post-nominal orders in the Swahili DP. She argues that the merge order is Dem-Num-Adj-N (which is not an allowable surface order in Swahili) but that Dem-N-Adj-Num, N-Dem-Adj-Num, N-Dem-Num-Adj and Dem-N-Num-Adj are all permissible for a noun phrase such as (3), meaning that the demonstrative can appear before or after the noun and the adjective and numeral can appear in either order.
(3) Wa-tu     wa-le  wa-wili  wa-zuri
    2-person  2-DEM  2-two    2-good
    'those two good people' [Swahili, G42, Carstens 2008, p. 155]
In Carstens's model, illustrated in (4), the order Dem-N is derived by the DemP moving from SpecXP to SpecDP, with N-Dem not involving this movement, while the noun always raises to D. The order Adj-Num is derived by allowing left and right adjunction of specifiers in the various functional projections of the DP. Carstens rejects accounts which allow for the N to raise to an intermediate position. Clemens and Bickmore (2021) discuss DPs in Rutooro (JE12), where normally nominal modifiers are post-nominal and flexible in their relative orders, except for possessives which must immediately follow the noun. However, one class of nominal modifiers can appear prenominally in Rutooro. This class includes the quantifier 'all' and demonstratives (Clemens and Bickmore 2021). As shown in (5), modifiers of this kind are freely ordered with respect to the noun, as in the Southern Bantu data we discuss here. Clemens and Bickmore (2021, p. 815) generally adopt the structure from Carstens (2008) illustrated in (4), and derive demonstratives and type 2 modifiers via left adjunction to DP (for modifiers in pre-nominal position), as shown in (6) for a pre-nominal demonstrative (representing the noun phrase in (5c)). Nominal modifiers in DP-final position are right-adjoined to DP.
(5) a. E-bi-bíra
Our proposed structure for Xitsonga and other Southern Bantu languages will build on (6) but introduce some additional movement to account for the variation observed.

Xitsonga DPs
Xitsonga shows typical nominal morphosyntax for a Bantu language. There are 14 noun classes and the different types of modifiers such as demonstratives, adjectives, numerals and other quantifiers show agreement with the head noun. Xitsonga DPs are head-initial: all classes of nominal modifiers follow the head noun in unmarked word order, as in (7a). All elements in the DP show flexible word order, even allowing fronting to prenominal position, where a fronted element is interpreted as marked for focus. Phonetically, the fronted modifier mambirhi 'two' in (7b) has a longer penultimate syllable, whereas the head noun ma-sangu '6-sleeping mat' in (7a), which occupies the same non-final position, does not display such lengthening. In the examples below, H tone is marked with an acute accent and vowels that sponsor an H tone are underlined.
(7) a. Ma-sangu        ma-mbirhí  ma-ntsó:ngó
       6-sleeping.mat  6-two      6-small
       'two small sleeping mats' (unmarked word order: N Num Adj)
    b. Ma-mbi:rhí  má-sángu        ma-ntsó:ngó
       6-two       6-sleeping.mat  6-small
       'TWO small sleeping mats (neither 1 nor 3)' (marked word order: Num N Adj)
The structures representing (7a,b) are shown in (8) and (9). These structures essentially follow Clemens and Bickmore (2021) but use a general focus phrase (FocP) in place of DemP that different types of nominal modifiers can move to (9). The unmarked word order with the noun in DP-initial position (7a) is represented in (8) and the marked order with the focused numeral (7b) in (9). The unmarked DP structure in (7a) features penultimate lengthening in the final prosodic word. When one of the modifiers is fronted as in (7b), the fronted modifier also displays penultimate lengthening. The penultimate lengthening of the final prosodic word in both examples is due to a higher prosodic phrase boundary (see Selkirk 2019; Kisseberth 1994). The penultimate lengthening in the fronted modifier is puzzling. Xitsonga shows penultimate lengthening before an intonational phrase boundary, but a single prosodic word is not an intonational phrase. As such, the penultimate lengthening on the fronted modifier must have a different source (cf. Kanerva 1990). Xitsonga could be a language that uses phrase-based cues for focus marking (cf. Kügler and Calhoun 2020).
Before proposing a possible solution to this puzzle, we discuss our basic assumptions on the indirect relationship between syntactic constituents and prosodic structures. This paper uses the Match Theory proposed in Lee and Selkirk (2022), which assumes that Match is a spell-out constraint between the morphosyntactic output (MSO) and the phonological input (PI). This spell-out constraint allows recursive prosodic categories as part of the representation, departing from theories of prosodic structure that assume the Strict Layer Hypothesis (Elfner 2018; Selkirk 1986). The prosodic structures in the phonological output (PO) are formed following language-specific ranking of prosodic constraints, unrelated to MSO. Based on these assumptions, we understand the prosodic structures of the sentences in (7) to be as schematized in (10) with recursive phonological phrases in the PI. The phonological phrases in PO in (10) follow the prosodic well-formedness constraints in (12) that are discussed in Section 2. The x in (10b) stands for a prosodic category that needs to be determined and which corresponds to the FocP in the morphosyntax.
(10) a. Unmarked order MSO: PI:
In example (a) of (10), there is a mismatch between the morphosyntactic output and the prosodic structure in the phonological output. In the morphosyntactic output, the noun and its modifiers form single-word phonological phrases, which are grouped with the verb into a phonological phrase, whereas the noun is grouped with the preceding verb in the prosodic structure. The phonological input shows a matching prosodic structure. The phonological output of the unmarked DP structure corresponding to (7a) has a recursive phonological phrase φ; the head noun and the verb form φ5 and the first modifier forms φ6 with the second modifier. The verb phrase itself also forms a phonological phrase φ4. In such a prosodic structure (10a) only the final prosodic word (i.e., Adj) shows penultimate lengthening. The prosodic structure in the phonological output in example (10b) also shows a mismatch with the morphosyntactic output. In the prosodic structure, the fronted Adj forms φ10 with the preceding verb. The recursive prosodic structure of (10b) differs from (10a) in that the fronted Adj forms its own phonological phrase and the noun and the Num form another phonological phrase φ11. The verb and the fronted Adj correspond to φ10 and the verb phrase is φ9. Penultimate lengthening is observed in the fronted Adj and the sentence-final Num.
We propose that a fronted element is a focus-marked phonological phrase, which corresponds to the FocP phrase in the syntax. The element x in the prosodic representation in example (10b) is the prosodic realization of a focus phrase. This φ will be indicated as φ-FOC in the rest of the paper. The idea that prosodic structure reflects an inherent syntactic-semantic feature is adopted from Kratzer and Selkirk (2020), who argue that in Standard American English and British English a given-marked constituent is not mapped to a phonological phrase in the phonological input. For Xitsonga, we assume that the fronted modifier is placed in the FocP and that the contrastive nature of focus results in a realization of φ-FOC with penultimate lengthening. In Kratzer and Selkirk (2020) and Lee and Selkirk (2022), it is proposed that the phonological input (PI) includes prosodic structure that may or may not match the morphosyntactic output. The restructuring of prosodic structure is due to constraint interactions between PI and the phonological output PO. In Section 2, the architecture of this theory will be introduced in detail.
Mirroring syntactic structures, prosodic structures are argued to be recursive in Bennett (2018); Ito and Mester (2009, 2012); Kubozono (1989); Ladd (1986); Selkirk (2011); Selkirk and Lee (2015); Wagner (2005). For prosodic recursion, see Elfner (2012, 2015); Ito and Mester (2013); Myrberg (2013) and the works cited there. Even so, the flexible nature of the word order in Xitsonga DPs is puzzling because our native speaker consultant judges all permutations of a DP to be grammatical (see Section 3). DPs with three modifiers (24 types) or four modifiers (120 types) allow much more flexibility than has been reported for most other Bantu languages. We suggest that the flexible word order is driven by prosodic requirements: (a) a focused element appears in FocP, which then maps into a prosodic phrase φ-FOC, and (b) other scrambling movement happens in the phonology only. As we show in Section 3, words that undergo scrambling movement do not show penultimate lengthening, nor are they interpreted as being in focus. While we demonstrate that scrambling movement is possible in Xitsonga, the exact nature of this type of movement in Xitsonga is yet to be uncovered; a future study focusing on mapping between the FocP in the morphosyntactic output and φ-FOC in the phonological input will shed more light on this.
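The counts of 24 and 120 word-order types follow from treating the head noun and its modifiers as freely permutable: a noun with three modifiers gives 4! orders and a noun with four modifiers gives 5! orders. The following minimal Python sketch enumerates them. The category labels (N, Num, Adj, Q, Dem) are placeholders chosen for illustration and are not tied to any particular example in this paper.

```python
from itertools import permutations


def dp_orders(noun, modifiers):
    """Return every linear order of a head noun and its modifiers."""
    return [" ".join(order) for order in permutations([noun, *modifiers])]


# Placeholder category labels (assumed for illustration only).
three_modifiers = dp_orders("N", ["Num", "Adj", "Q"])
four_modifiers = dp_orders("N", ["Num", "Adj", "Q", "Dem"])

print(len(three_modifiers))  # 24 orders: 4! for a noun plus three modifiers
print(len(four_modifiers))   # 120 orders: 5! for a noun plus four modifiers
```

The enumeration only counts logically possible strings; whether each order is derived by movement to FocP or by scrambling is a separate question addressed in the text.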

Prosodic Phrasing in Xitsonga
Xitsonga, like the vast majority of Bantu languages, is a tone language that also makes extensive use of penultimate lengthening to mark phrase boundaries (Kisseberth 1994). High (H) tone spreading and lengthening of penultimate vowels are important markers of phonological phrase boundaries. In Xitsonga, H tone spreading is unbounded and this pattern is forced by the markedness constraint on H tone spreading shown in (12a). In (11a), a H tone originates from the subject prefix vá- and spreads to the penultimate syllable of the object. The examples in this section show that penultimate lengthening is an indicator of a higher phrase boundary. H tones do not spread onto the final syllable of a constituent (which may be a phrase at a relevant boundary or the entire sentence) due to the NON-FINALITY constraint (12b). In example (11b) with all toneless words, no H tone appears in the surface representation. Following Lee and Selkirk (2022), we assume the phonological input (PI) has prosodic structures that mirror the morphosyntactic output. In Xitsonga, a phonological phrase is required to have two prosodic words; a phonological phrase with a single prosodic word undergoes deletion of the phonological phrase (see PO of 11a). The deletion of φ (12d) of non-binary φ's is forced by the markedness constraint that requires two prosodic constituents within a φ as in (12c). To make it easier to read the data, we have indicated prosodic boundaries with round brackets in a separate line below the gloss. In subsequent examples, prosodic word (ω) boundaries are omitted.
(11) a. Vá-xává  nyá:ma
        SM2-buy  9.meat
        PI:
        'They buy meat' (originally from Kisseberth 1994, p. 142)
     b. Ni-xava   nya:ma
        SM1S-buy  9.meat
        PI:
The target for penultimate lengthening (PL) and H tone spreading is identical in (11a), but this is not always the case in Xitsonga. One such mismatch between PL and HTS is found in (13), where the H tone spreads from the subject prefix (13a) or the verb (13b) to the noun class prefix of the object but not beyond when the noun has a toneless syllable followed by a H-toned syllable. The penultimate syllable is lengthened but remains toneless. The blocking of H tone spreading onto the toneless penultimate syllable is due to the Obligatory Contour Principle (14) (cf. Goldsmith 1976; Leben 1973; Lee 2009).
(13) a. Vá-xává  má-ta:ndzá
        SM2-buy  6-egg
        PO: ι ( φ (vá-xává má-ta:↓ndzá) φ ) ι
        'They buy eggs' (originally from Kisseberth 1994, p. 144)
     b. Ni-vóná   vá-la:lá
        SM1S-see  2.enemy
        PO: ι ( φ (ni-vóná vá-la:↓lá) φ ) ι
        'I see enemies' (originally from Kisseberth 1994, p. 144)
(14) OCP(ω, H)
     Assign a violation mark to two H tone spans that are syllable-wise adjacent within a prosodic word ω (Lee and Selkirk 2022, p. 353)
Another example of the mismatch between the landing site of H tone spreading and penultimate lengthening is found in ditransitive constructions, as shown in (15), where the H tone spreads to the penultimate syllable of the first object, but not beyond. In (15), penultimate lengthening does not appear on the first object but only on the second object. In PI, each XP (corresponding to the two object NPs and the VP) is a phonological phrase. Two changes between PI and PO are at issue: the deletion of the phonological phrases encompassing the first and the second objects, which match the morphosyntactic output, and the insertion of a phonological phrase including the verb and the first object, which violates DEP(φ,H) in (16b). Both are enforced by the BINARITY constraint, which requires a phonological phrase to contain two prosodic constituents (12c), and by the STRONGSTART constraint defined in (16a).
Example (15) shows that the domain for H tone spreading is the phonological phrase which groups the verb and the first object to the exclusion of the second object (Lee and Selkirk 2022), but the domain of penultimate lengthening is different. If the domain of the penultimate lengthening were any phonological phrase, we would expect to observe penultimate lengthening in the first object as well as in the second object. We propose in Section 4.1 that penultimate lengthening targets the φ-max, which is why it appears on the second object fo:le. This mismatch further suggests that the penultimate lengthening we observe in the pre-nominal adjective ma-ntsó:ngó 'small' in (7b) could be due to a φ-max boundary that matches the FocP in the syntax.
(15) a. Vá-xávélá    xí-phúkúphúku  fo:le
        SM2-buy.for  7-fool         5.tobacco
        'They buy tobacco for a fool' (originally from Kisseberth 1994, p. 148)
     b. Ni-nyíká    mú-nhu    nya:ma
        SM1SG-give  1-person  9.meat
        PI:
        'I give someone meat' (originally from Kisseberth 1994, p. 148)
(16) a. STRONGSTART(φ)
        Assign a violation mark when a prosodic constituent φ begins with a leftmost daughter constituent π_n which is lower in the prosodic hierarchy than the constituent π_{n+1} that immediately follows.
H tone spreading patterns show dialectal differences as well. In Mozambican Xitsonga (Kisseberth 1994, p. 148), a H tone spreads to the penultimate syllable of the first object, but not beyond, as shown in (17b), which shows that NON-FINALITY is at work at the phonological phrase level. In a dialect from the Mhinga area, however, H tone may spread until the final syllable of a phonological phrase (17a). Although the application of NON-FINALITY at different levels yields different H tone spreading patterns, the penultimate lengthening pattern holds in both dialects. Only the penultimate syllable before a higher prosodic phrase boundary is lengthened.
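The interaction of unbounded H tone spreading with NON-FINALITY, and the separate domain of penultimate lengthening, can be illustrated with a minimal Python sketch. It is a procedural approximation and not the constraint-based analysis itself. The syllabifications, the "H"/"0" tone notation and the function names `spread_h` and `lengthen_penult` are assumptions made for illustration. The sketch models the pattern in which spreading stops at the penultimate syllable of the phonological phrase; OCP blocking as in (13) and the Mhinga pattern are not modelled.

```python
def spread_h(syllables, sponsor):
    """Spread an H tone rightward from the sponsor syllable through the
    phonological phrase, stopping before the phrase-final syllable
    (NON-FINALITY). `sponsor` is an index, or None for toneless phrases."""
    tones = ["0"] * len(syllables)
    if sponsor is None:
        return tones
    # The sponsor always keeps its own H, even if it is phrase-final.
    last = max(sponsor, len(syllables) - 2)
    for i in range(sponsor, last + 1):
        tones[i] = "H"
    return tones


def lengthen_penult(word):
    """Mark the penultimate syllable of a word (a list of syllables) as long."""
    out = list(word)
    if len(out) >= 2:
        out[-2] += ":"
    return out


# (11a): H from the subject prefix spreads up to the penult of the object.
tones_11a = spread_h(["va", "xa", "va", "nya", "ma"], sponsor=0)

# (11b): all-toneless phrase, so no H surfaces.
tones_11b = spread_h(["ni", "xa", "va", "nya", "ma"], sponsor=None)

# (15a): spreading domain is verb + first object; lengthening is on the
# second object, which closes the higher prosodic phrase.
tones_15a = spread_h(
    ["va", "xa", "ve", "la", "xi", "phu", "ku", "phu", "ku"], sponsor=0
)
second_object_15a = lengthen_penult(["fo", "le"])

print("".join(tones_11a))       # HHHH0
print("".join(tones_11b))       # 00000
print("".join(tones_15a))       # HHHHHHHH0
print("".join(second_object_15a))  # fo:le
```

Keeping the two functions separate reflects the observation in this section that the domain of H tone spreading and the domain of penultimate lengthening need not coincide.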

Recursion and Flexible Word Order in Xitsonga
The data we present in this section come from a Xitsonga-speaking trained linguist in his late 40s from Giyani (Limpopo, South Africa). DPs with two modifiers in unmarked order (nominals followed by modifiers) were created by him. The order of the modifiers was then varied by the first author to create a scrambled order of DP-internal elements. The speaker judged that the word order of all the sentences was acceptable, with remarks that marked word order displays focus effects on modifiers in the non-canonical position(s). In complex DPs, with a quantifier and an adjective in (18) and a numeral and an adjective in (21) modifying the noun (which here appears as the object), the Xitsonga consultant allowed all possible orders of modifiers, including orders in which all of the nominal modifiers appear in pre-nominal position (18e,f). The data in the rest of the paper are based on the recording of the list by the consultant.
In (18), the complex DP has a head noun followed by an adjective (a lexical word) and a quantifier (a functional word). The final prosodic word in each example shows penultimate lengthening as it appears before a higher prosodic phrase boundary. In the unmarked order in a complex DP, as shown in (18a), the last prosodic word hínkwá-wo (the quantifier) occurs with penultimate lengthening. When the quantifier (18b) precedes the adjective in the post-nominal position, both the quantifier and the adjective display penultimate lengthening. When the adjective precedes the head noun and the quantifier as in (18c,d), the adjective is focused, displaying penultimate lengthening. Although not part of the penultimate lengthening domain, the element immediately following the focus phrase is also focused according to our consultant. In (18c), both the adjective and the head noun are focused, while the adjective and the quantifier are focused in (18d). In (18e) and (18f), the quantifier that appears in the DP-initial position shows penultimate lengthening. Item (18e) has focus interpretation on the quantifier and the head noun, whereas all DP elements are focused when the DP-internal elements appear in a reverse order from the unmarked order as in (18f).
        ↓hínkwá-wo  ma-sa:ngu
        all-6       6-sleeping.mat
        PO: ι ( φ ( φ (ni-xava φ (ma-ntsó:ngó) φ−FocP ) φ φ (↓hínkwá-wo ma-sa:ngu) φ ) φ ) ι
        'I buy ALL SMALL sleeping mats' (marked word order: Adj-Q-N)
     e. Ni-xava   hínkwá:-wo  ma-sangu        ma-↓ntsó:ngó
        SM1S-buy  all-6       6-sleeping.mat  6-small
        PO:
        'I buy ALL SMALL SLEEPING MATS' (marked word order: Q-Adj-N)
Following Carstens (2008) and Clemens and Bickmore (2021), we assume that DP modifiers such as adjectives, numerals and demonstratives can adjoin to the left or right and we extend this to quantifiers such as hínkwá-wo 'all (-NC6)'. In the case of quantifiers, the unmarked position, as shown in (18a), is for the quantifier (QP) to be right-adjoined, as shown in (19).
The marked position of the quantifier is derived by left-adjunction to NumP, which we assume, like DP, has a FocP, as shown in (20). Any single DP modifier that moves and is merged in this position is prosodically marked with PL. The data in (21) and (22) show Xitsonga DPs with two lexical modifiers. The pattern becomes more complex when we consider the word order of DPs with two adjective-like modifiers (an adjective and an adjective-like numeral), as in (21). In all examples in (21) and (22), the final prosodic word shows penultimate lengthening and is followed by a prosodic boundary. In (21a), penultimate lengthening occurs on the phrase-final prosodic word in the unmarked order. In (21b), after the head noun, the adjective precedes the numeral; the adjective is focused and shows penultimate lengthening. The free ordering of the elements in Xitsonga DPs creates additional possible orders. In (21c), the initial prosodic word is the adjective mantsóngó 'small' and in (21d), the initial prosodic word is the numeral mambirhí 'two'. In these examples, the fronted prosodic word is in a focused position in the syntax and the prosody inherits the focus marking. The fronted element is focus-marked and it is realized with the penultimate lengthening pattern.
(22) a. Ni-xava   ma-ntsóngó  má-mbi↓rhí  má-sá:ngu
        SM1S-buy  6-small     6-two       6-sleeping.mat
        PO: ι ( φ ( φ (ni-xava ma-ntsóngó) φ φ (má-mbi↓rhí má-sá:ngu) φ ) φ ) ι
        'I buy TWO SMALL sleeping mats' (marked word order: Adj-Num-N)
     b. Ni-xava   ma-mbirhí  má-sángu        ma-↓ntsó:ngó
        SM1S-buy  6-two      6-sleeping.mat  6-small
        PO: ι ( φ (ni-xava φ ( φ (ma-mbirhí má-sángu) φ ma-↓ntsó:ngó) φ ) φ ) ι
        'I buy TWO small SLEEPING MATS' (marked word order: Num-N-Adj)
The same level of word order freedom as for DP objects is observed with DPs in subject position, as shown in (23a) with unmarked order and in (23b) with the marked order. When a complex DP appears in the preverbal position, the final prosodic word (in these examples, the verb) shows penultimate lengthening. The fronted modifier in (23b) shows penultimate lengthening because it is a φ-max that corresponds to a φ-FocP in PI, which matches FocP in the syntax. The other pre-nominal modifier, hínkwá-wo 'all-6', is not realized with penultimate lengthening, showing that not all fronted modifiers automatically undergo penultimate lengthening. Instead, hínkwá-wo and the noun form a phonological phrase. The final word in the subject is realized with penultimate lengthening when the complex DP is in the unmarked word order (23a), whereas this penultimate lengthening is not observed when a complex DP shows marked word order (23b). We assume the subject DP is generated in Spec vP and moves to Spec TP (or an equivalent projection) for agreement. Unlike the object DP, the subject projects its own φ in the unmarked case (23a). However, when the subject DP includes a focused element which projects its own φ-FocP, the rest of the subject DP is phrased with the verb rather than in its own φ-max (23b). The unmarked phrasing of the subject DP and the realization of different sizes of prosodic phrases for different types of lexical categories are somewhat reminiscent of what Rolle and Hyman (2018) discuss for the Bantu language Makonde (P23) as 'prosodic smothering'. However, in Xitsonga, the linear order of elements also plays a role in this.
(23) a. Ma-sangu        ma-ntsóngó  hínkwá:-wo  ↓má-tá-dú:rhá
        6-sleeping.mat  6-small     all-6       SM6-FUT-be.expensive
        PO: ι ( φ ( φ (ma-sangu ma-ntsóngó) φ hínkwá:-wo) φ φ (↓má-tá-dú:rhá) φ ) ι
        'All small sleeping mats will be expensive' (((N Adj) Q) V)
     b. Ma-ntsó:ngó  hínkwá-wo  ma-sangu        ↓má-tá-dú:rhá
        6-small      all-6      6-sleeping.mat  SM6-FUT-be.expensive
        'All SMALL sleeping mats will be expensive' (((Adj) (Q N) V))
In sum, this section has shown that the variable order of DP-internal elements gives rise to varying prosodic realizations. In DPs with two modifiers, an initial head noun forms a phonological phrase with a preceding verb while the following modifiers form their own phonological phrases; penultimate lengthening is absent in the head noun. When one of the modifiers appears in the DP-initial position, that modifier, if focused, is realized with penultimate lengthening, which we analyze as the reflex of a focus-marked φ in PI that matches a focus phrase (FocP) in the morphosyntax. The unexpected pattern in (22b) without penultimate lengthening needs further investigation, but we assume that the numeral-noun order is preferred as one prosodic phrase. We now turn to analyzing and discussing the relevant structures in more detail.

Penultimate Lengthening in Focused Elements of a Complex DP
In Section 3, we showed that Xitsonga displays penultimate lengthening in elements in a DP that are focused in non-final positions. In the final position of the intonation phrase, penultimate lengthening is obligatory due to a higher prosodic phrase boundary, a pattern that is well documented for Bantu languages, including Xitsonga (Kisseberth 1994). This study offers a window into the part of Xitsonga grammar that involves penultimate lengthening in these non-final positions: our proposal is that within a complex DP, penultimate lengthening marks focused elements when FocP is matched to a focus-marked phonological phrase, which is argued to take the status of a φ-FOC. The obligatory penultimate lengthening in the sentence-final position is argued to be due to the presence of φ-MAX (a φ that is not dominated by another φ, cf. Ito and Mester 2009), rather than an intonational boundary. If the source of penultimate lengthening is a specified φ (either φ-FOC or φ-MAX), then the distribution of penultimate lengthening in Xitsonga may have a unified source.
Let us illustrate our analysis with the examples in (24) which feature a complex DP with two modifiers. The unmarked order of the post-verbal complex DP is noun-numeral-adjective. In (24a), the verb and the head noun of the complex DP form a phonological phrase and so do the two modifiers. The prosodic grouping is enforced by the BINARITY constraint in (12c). The two φs form a phonological phrase which is dominated by another φ phrase (a φ-MAX). Penultimate lengthening occurs in the final word of the φ-MAX. When the modifier is moved to FocP in the syntax and the fronted element is a focus-marked φ as in (24b), penultimate lengthening is expected. This phrasing violates STRONGSTART (cf. 16a), but the violation is forced by a higher-ranked constraint that bans deletion of a focus-marked φ.
(27) MAX-φ-FOC
     Assign a violation mark when a φ-FOC of PI is not present in PO (cf. a specific version of MAX-φ in Lee and Selkirk (2022, p
When a complex DP has multiple modifiers, the penultimate lengthening of a fronted modifier such as ma-ntsó:ngó 'small' in (24b) is due to the pressure to avoid deletion of a focus-marked φ in the input. In the syntax, the fronted modifier is in FocP (see 9) and that information is inherited in a prosodic constituent in the phonological input by focus marking fronted elements with FOC. This proposal is restrictive: only modifiers fronted to the FocP will show penultimate lengthening.
An alternative analysis is that focus-marked phrases show penultimate lengthening due to prosodic promotion (Ishihara 2019) of prosodic words either to an intonational phrase or a φ-MAX phrase. This analysis assumes that prosodic words will be promoted to a higher level, but with penultimate lengthening patterns it is not immediately clear which prosodic domain a focus-marked prosodic word would be promoted to or what restricts the promotion of prosodic levels.

Does Prosody Directly Refer to Syntax?
Whether the prosodic component of the grammar can refer directly to the syntactic output has been a subject of debate in the literature on the prosody-syntax interface. We maintain that the prosody only makes indirect reference to the syntax (following Selkirk 2019), but further assume that phonological input includes prosodic structures that mirror morphosyntactic output (following Lee and Selkirk 2022). The penultimate lengthening in the non-final position of Xitsonga complex DPs suggests that such an architecture is possible. Penultimate lengthening in the non-final position refers to a syntactic position (i.e., the specifier position of the Foc DP phrase) that is mirrored in the phonological input as a focus-marked φ. H tone spreading also shows indirect reference to the syntactic structure; thus, both penultimate lengthening and H tone spreading are only sensitive to prosodic structures that are in the phonological output. If our analysis is on the right track, what we observe in Xitsonga may support the following observation: "prosodic marking of focus in Bantu languages involves the establishment of focus-related phrasal domains that are only indirectly conditioned by focus" (see Downing and Hyman 2016, section 41.3.3).

Implications for Bantu DP Structure
While the left and right adjunction accounts introduced in Section 1.1 can account for some of the flexible word orders in the Bantu DP, they cannot account for more complex DPs in variable orders (e.g., with several adjectives and/or multiple types of quantifiers). Allowing more flexibility requires either a number of FOC and TOP projections (cf. Aboh 2004) or arguing for another type of movement such as scrambling or prosodic movement that applies to φ-phrases (as argued for Japanese in Agbayani et al. 2015). We will argue based on phonological data that two of these processes happen in the Xitsonga DP. Aboh (2004) argues for an extended DP for Gungbe, as in (30), which has focus and topic projections in its left periphery based on the discourse nature of DP distinctions such as definiteness and specificity.

(30) [Bracketed structure of the extended DP in Gungbe (Aboh 2004)]
To account for our data and the rest of Southern Bantu, we propose a different structure, where Foc DP P is just above D and where partial raising allows for the noun to appear in intermediate positions (which we do not all spell out here). Adjectives, numerals and quantifiers are all phrasal categories and any of these phrases can move to SpecFoc DP P if they are in focus. This kind of structure (shown in 31) can account for the one-to-one match of penultimate lengthening for one modifier in the pre-nominal position. We propose that the remaining "reorderings", which do not affect meaning, are caused by scrambling (cf. Hiraiwa 2010). We have thus seen that two types of movement seem to take place in the Xitsonga DP: movement to a single focus position above D, which is available to any DP modifier in Southern Bantu, and another type of movement to lower positions, which gives rise to the complete flexibility in DP word orders observed in Xitsonga. The second type of movement is not associated with meaning changes, nor with extra penultimate lengthening; therefore, we argue that this is scrambling. One might think that alternative accounts could explain this kind of freedom. However, a treatment of most modifiers as relative clauses, for example, cannot explain their movement to the pre-nominal position, since there is no evidence that Bantu languages allow relative clauses with non-initial heads. An account where these clauses would be headless relatives would not work either when an overt noun is present (for overviews of Bantu relative clause syntax, see Henderson 2006; Ngonyani 2001; Zeller 2004).

Conclusions
In this paper we presented syntactic evidence from Xitsonga which shows that prenominal modifiers have flexible word order in a pattern that appears to be widespread in Southern Bantu languages. For this subgroup of Bantu languages, this refutes the frequent claim that only a single modifier of certain categories may appear prenominally in Bantu DPs or that Bantu DPs are always noun-initial or demonstrative/quantifier-initial. Our Xitsonga data show that in Southern Bantu there are two types of processes causing this: syntactic movement of a single phrase to SpecFoc DP P and scrambling-type movement creating free order in the post-nominal and pre-nominal domains of the Xitsonga DP. Our prosodic data show that syntactic movement to SpecFoc DP P is mirrored in the phonological input as a focus-marked phonological phrase, which is marked by penultimate lengthening in the output. This focus-marked phrase is linked to focus readings but no other semantic or morphosyntactic effects. We propose that this is due to Xitsonga grammar requiring focus-marked phonological phrases to be overt, so that they resist deletion under grammatical pressures such as STRONGSTART and BINARITY. Further research will be needed to see if the focus-marked prosodic phrasing pattern is as consistent across Southern Bantu as the syntactic pattern.

1 As is customary in much of the literature on Bantu languages, we use the so-called Guthrie code to unambiguously identify each Bantu language we refer to in parentheses where we first mention it in the text. We follow the codes as given in Maho (2009). The data in this paper was created by a native Xitsonga speaker consultant from South Africa. The data is described in more detail in Section 3.
2 Refer to the appendix sections for the abbreviations used in glosses.
3 In fact, Clemens and Bickmore (2021) note that several modifiers can appear prenominally in Rutooro, again as in Southern Bantu. Unfortunately, Clemens and Bickmore include no examples or discussion of these kinds of structures in Rutooro.
4 The DP in the MSO in (10a) is not matched to a phonological phrase in the PI. In Xitsonga, only lexical phrases in the MSO match to a phonological phrase in the PI. Details of the reasoning can be found in Lee and Selkirk (2022).