A “borderline case” of syntactic variation

Dialectal maps of morpho-syntactic phenomena sometimes display patterns that either differ quite drastically from the traditional dialectal boundaries (which are mostly based on phonology or the lexicon) or are otherwise unexpected. This paper argues that these deviations should be taken seriously, namely as a potential tool for detecting the different types and qualities of syntactic micro-variation. As a case study, differing patterns in the distribution of the infinitival marker zu across various infinitival constructions within the Alemannic dialect group are examined. It is argued that an analysis of the infinitival marker as the lexical realization of the [±coin] value within a theory of temporal anchoring, as proposed in Ritter & Wiltschko (2014), provides the flexibility needed to capture these differing variational patterns.


Introduction: Isoglosses and syntactic variation
The areal distribution of morpho-syntactic phenomena against the background of the traditional isoglosses is at first sight not expected to be fruitful for syntactic theorizing. The reason is that traditional isoglosses were established mainly on phonological and lexical criteria, i.e. the levels of linguistic description that used to be essentially the sole focus of traditional dialectology, see Glaser (1996). But morpho-syntactic phenomena show interesting areal clusterings and patterns as well. 1 Although syntactic phenomena often cover larger areas than phonological ones, they nevertheless reflect these traditional isoglosses in many cases. On the other hand, they sometimes cross-cut the established ones and show patterns that are hard to reconcile with what we know about the spreading of linguistic features, see Poletto (2013) for similar observations. Poletto talks about "leopard spot-patterns". This term is meant to capture the situation where, in an otherwise homogeneous distribution of linguistic features, a definable area suddenly shows up with different properties. Auer (2005) discusses several cases from German dialects where a phonological divergence of formerly similar dialects can be observed. He argues plausibly that these changes were induced by new political, i.e. artificial, borders. 2 Thus, it can be taken as a well-established fact that dialectal areal variation is prone to deliberate, conscious decisions on the part of the speakers. Transferred to morpho-syntax, the null hypothesis would be that if the patterns coincide with the traditional isoglosses, we have good reasons to take the variant spoken in this area as a well-established language system. We should be able to capture this kind of variation by using the standard tool box of generative syntax. 
That means that we can assign differing specifications to a functional item, which are in turn responsible for the different syntactic outcomes, much in line with the Borer-Chomsky conjecture, see Baker (2008). As such we can label it as parametric variation. In the conception of hierarchically organized parameters by Biberauer et al. (2012), we would talk in these cases at least of micro-parameters. I thus follow the early ideas of Moulton (1968) that well-established and stable isoglosses are essentially motivated language-internally. 3 But if the patterns differ from the established ones and instead follow artificial borders, for example national ones, extra-linguistic factors should be assumed to play a role, for example group identity or, conversely, delimitation from other groups, see again Poletto (2013) for similar considerations. However, for extra-linguistic factors to be able to play a role in morpho-syntax, the phenomenon in question must be of such a type that the differing outputs do no harm to the rest of the grammar. In the terms of Biberauer et al. (2012), this would be a nano-parameter, affecting essentially only a single lexical item. As will be illustrated below, such patterns indeed occur, typically with lexical items of the non-functional lexicon, but as we will see, this situation occurs even in cases that at first sight clearly fall under morpho-syntax. What I will also take into consideration is syntactic free variation, as it has been discussed for example in the context of the word order variability in 3verb-clusters, see Barbiers (2005), also Seiler (2004), and Sapp (2011). Sapp (2011) shows from a diachronic perspective that there indeed seem to be no relevant syntactic factors that determine the choice of a given order. And as we will see, the areal variation attested with 3verb-clusters in Alemannic shows exactly such an unexpected pattern, although it is a bit different from pure lexical variation, as will become clear in the following. 
This kind of variation can be traced back to underspecification in the sense that the syntactic derivations of the various orderings are all equally costly, if one takes economy into account.
If these considerations are on the right track, unclear or inconsistent areal patterns of certain phenomena should not be dismissed as unusable data. Rather, they may give us the relevant clues for disentangling those phenomena that can be properly treated (and explained) with the help of formal parametric syntax from those that were formerly termed "superficial" or "mere PF-variation".
The data to be discussed stem from the Alemannic dialect group. Alemannic is particularly well suited for this endeavor. First, it is very well documented and parts of its grammar have been described and analyzed in modern linguistic terms as well as in traditional work. Thus, we can be quite sure that the attested variation is not an artefact, e.g. due to insufficient descriptive work. Second, the sub-divisions of Alemannic have grown diachronically and have remained very stable to this day. The most important aspect for the issue here however is that these dialectal sub-divisions cross-cut the political borders in a likewise stable way. High Alemannic shares all relevant linguistic features to count as one sub-dialect -although it is spoken in parts of Germany and in parts of Switzerland. Finally, it should be noted that Standard German is shared (passively) by all these speakers. Possible interference effects can thus be easily controlled for.
In order to use the differing areal patterns as a heuristic tool for the detection of possible candidates for the different types of variation discussed above, it is first necessary to […] After having established these three types of variation, section 3 is devoted to a case study of the infinitival marker zu. A consideration of its (non-)occurrence, shape and position in (i) purpose clauses, (ii) infinitival complements of the forget/try class of verbs (simultaneous non-propositional in the terminology of Wurmbrand (2014)), and (iii) tough-constructions will show that it exhibits all three types of variation discussed above. This is in sharp contrast to Standard German, where the infinitival marker shows up in a uniform way in all three cases. In order to capture this situation, it will be proposed that the infinitival marker is indeed meaningful, specifically that it is the overt expression of the temporal relations between the infinitival complement and the matrix. In a nutshell, the presence of zu indicates that the two events do not (or not completely) overlap in their temporal extension. In all other cases, no marking shows up. This distinction divides modal/perception verbs from verbs taking propositional complements, and it will provide a basis for the variation within the forget/try class, which is also attested diachronically. The latter class is known to show variable behavior in other areas as well, and this will be traced back to its special temporal composition, based on the analysis by Grano (2011). The proposal is formulated using Ritter & Wiltschko's (2014) Anchoring Phrase instead of a TP in order to capture the fact that the marker is not a verbal element but nevertheless has influence on the temporal interpretation. The approach will overcome the traditional grammaticalization scenario for the infinitival marker: from (locational) goal via purpose to meaningless marker, see Haspelmath (1989). 
This is a welcome result, since we will see that this scenario has its weaknesses, both conceptually and empirically. Finally, section 4 concludes.
[…]tic terms. This kind of data allows new analyses to explain why these correlations might hold, see, e.g., Barbiers et al. (2016) for some case studies, also Westergaard et al. (2017). These analyses corroborate the conception of dialectal variation as being qualitatively non-distinct from the syntactic variation found between languages that are genetically less close. The fact that we find less syntactic variation between dialects can be attributed to the fact that dialects share a large amount of the historically grown lexicon. Since Borer (1984), it is widely accepted that the locus of variation must be sought in the functional lexicon. Therefore, the more of the functional lexicon is shared, the more likely it is to find similar syntactic properties. The difference between macro- and microvariation under this view is thus gradual, see Poletto (2013) for further elaboration of this point. The existence of so-called transition zones, as discussed in Barbiers et al. (2016) and Glaser (2013), corroborates this view further.
However, as said in the introduction, dialectal maps are often not as clear as one would like them to be. Sometimes, no meaningful areal pattern can be detected at all, and sometimes the distribution cross-cuts the established regions, in the sense of historically grown (sub-)dialects, in an unexpected way. Some cases may be due to language change, since dialects are of course also subject to diachronic development. Others may be an artefact of the method used. But, as will be shown below, there are instances of distributions where one cannot deny that "artificial" borders play a role. With "artificial" I refer to national borders 4 that are either the result of political re-ordering in Europe (e.g. Alsace) or that have been politically stable for a long time but where a sub-dialect has nevertheless cross-cut this border for centuries. Such a region is the southernmost part of southwestern Germany, in which High Alemannic is spoken, a variant that is spoken in large parts of Switzerland as well. This area will be the main focus of the paper. Before considering the morpho-syntactic phenomena in more detail under the areal perspective already discussed, some background information on the sub-division of Alemannic is necessary.

The sub-division of Alemannic
The dialects grouped under the name Alemannic span four different countries: Alemannic is spoken in Southwest Germany, in the northern part of Switzerland, in Vorarlberg, a region in Austria, and in Alsace in France. Alemannic is divided into several sub-dialects. Map (1a) shows the widely accepted division according to Wiesinger (1983). 5 There is a well-known distinction in the vowel system between Swabian and the other Alemannic dialects: Swabian shifts certain long vowels into diphthongs (like Standard German) whereas the other Alemannic dialects kept the older stage of the monophthong. 6 The following two maps 7 (Maps 1a and 1b) show the distribution of the diphthong /au/ vs. the monophthong /uu/ in the lexical item braun (=brown), as produced by the speakers in a translation task (SynAlm FB1_6-2). As can be seen, the distribution of the monophthongs follows the established isoglosses rather strictly and thus shows that, at least on the phonological level, the isoglosses are still intact. Map 1b uses the same geographical coordinates but displays the political borders instead of the dialectal ones. The region south of Freiburg im Breisgau is the one where the dialectal border cross-cuts the political border between Germany and Switzerland. Thus, the isogloss between Swabian and Middle and High Alemannic is not constituted by the River Rhine.
Some more fine-grained sub-divisions within the Alemannic region could be reproduced with data from SynAlm as well, as Map 2a and 2b show. On these maps the results for different versions of the particle ge, used in motion verb constructions, are displayed (see the next sub-section for detailed data and discussion). This particle is realized with different vowels in the various sub-dialects and is therefore labeled gV in the following. The data displayed on these maps stem from a translation task and, since this particle does not occur in Standard German, any type of interference effect can be excluded.
Map 2: Vowels in gV, Wiesinger map. yellow: ga; orange: ge; brown: gi; blue: go.
Brandner: A "borderline case" of syntactic variation Art. 25, page 6 of 34
Both maps show the stability of the dialectal areas and how they cross-cut the political border. In Map 2a and 2b, the blue dots (=go) appear in the High Alemannic area which is spoken in Germany and Switzerland. These two examples from phonology were meant to show for now that (i) the isoglosses are indeed stable, and (ii) that the data gained by SynAlm are reliable.
The project SynAlm ran for five years, during which seven questionnaires were sent out. The number of informants could not be kept constant for the whole period: of about 1000 informants in the first round, 517 sent back the last questionnaire. Fortunately, these informants nevertheless covered the whole area, but the number of informants per measuring point decreased. The dots in the maps represent the 350 measuring points. As the number of informants per point is not constant, a uniform colouring can also mean that there is only one informant per point. The project itself focussed on (contrastive) fine-grained grammaticality judgments. But due to a range of translation tasks, phonological and lexical data were also gained. Most of the data to be discussed in the following 8 stem from judgment tasks.

A case of parametric variation
In this section, I will show that the traditional Alemannic borders, which are based on phonological phenomena, are also relevant for morpho-syntactic patterns. For this, we can consider again the particle gV that typically occurs with motion verbs and has already gained some attention in the literature on Swiss German, e.g. Lötscher (1993), Schönenberger & Penner (1995) or Riemsdijk (2002). An example from Swiss German is given in (1): 9
(1) Swiss-ALM
I gang em vadder bim ufflade go hälffe.
I go the.dat father at.dat uploading gV help
'I am going to help my father with the loading.'
There is not only variation with respect to the form, as illustrated with Map 2; we find in addition syntactic differences, as discussed in Brandner & Salzmann (2009; …). One of the most prominent differences is the relative position of the particle within the infinitival complement. Consider (1), where the particle follows the arguments of the infinitive and immediately precedes the infinitival verb, in contrast to (2), where it precedes the arguments:
(2) German-ALM
I gang gi em vadder bim ufflade helfe.
I go gV the.dat father at.dat uploading help
'I am going to help my father with the loading.'
This difference in position goes hand in hand with differing restructuring possibilities, while the interpretation as a single event is kept constant. To account for this situation, Brandner & Salzmann (2012) propose two different grammaticalization scenarios for the two varieties. They assume with Lötscher (1993) that the particle gV originates in a shortened version of the allative preposition gegen (=against, towards), which surfaced as gen (=towards) and, due to word-final n-drop in Alemannic, resulted in gV. As can be seen in Map 2, the sub-dialects of Alemannic differ in the vowel. Due to the low vowel in Swiss-ALM, the preposition is reanalyzed as the verb gehen (to go), which is homophonous in its infinitival form with the particle (go/ga). 
Exactly because of this similarity, the construction is often referred to as "verb doubling" in the literature, see e.g. Schönenberger & Penner (1995) and Riemsdijk (2002). The particle thus in fact occupies a verbal head within the VP, with a syntax akin to that of modal verbs. 10 This analysis is corroborated by the fact that in Swiss-ALM this kind of verbal doubling has spread to other verbs, e.g. choo (come), lo/lasse (let) and aafange (start).
This reanalysis did not happen in German-ALM. The infinitival form of gehen is the same as in Swiss-ALM, go/ga. As this form does not match the vowel of gi/ge, a relevant pre-condition for the reanalysis process is not met. Consequently, ge/gi has kept its prepositional character. As such it is situated in a left-peripheral position of the whole infinitival projection, preceding the arguments (if present). This fits neatly with the fact that we have no attestations of verb doubling spreading to other verbs in German-ALM. Brandner & Salzmann (2012) discussed only data from Zurich German and from the region around Lake Constance. With the comprehensive data from SynAlm, the expectation is now that the form of the particle correlates with its position: those variants that use go/ga should have the particle in a low position within the VP, as a part of the verbal complex, whereas the others should preferably place it at the left edge of the infinitival complement. This expectation is borne out, as Map 3 illustrates. 11 Of special interest is the marked region north of Basel. In this area, High Alemannic is spoken. Recall that this is the sub-dialect that has cross-cut the political border for hundreds of years. And as we can see, it indeed patterns more closely with the Swiss-ALM version.
10 As Postma (2014) correctly points out, the reanalysis cannot be complete, since we would then expect that go/ga can also appear at the very end of the verbal complex, as is the case, e.g., with auxiliaries, contrary to fact. Further research is necessary to settle this issue.
11 The relevant data were gained by using a judgment task on a 5-point scale (1 = completely natural in the dialect, 5 = unnatural in the dialect). Sentences were presented with gV either preceding the argument or immediately preceding the infinitival verb, cf. the examples in (1) […]
A further indication that we are dealing with a parametric difference is the usage of gV as a preposition in some of the Alemannic sub-dialects. As mentioned above, there is consensus that gV evolved out of the common Germanic preposition gegen (>gen). The special property of gen is that it occurs only with place names. According to the DWB, 12 it was common until the 16th century and has since then been gradually replaced by the prepositions auf or nach:
(3) Early New High German (Luther, tischr. […]
However, in some sub-dialects of Alemannic, the preposition gen has survived and is used actively by the speakers. Map 4 shows the acceptance of gV in Alemannic preceding a place name. Note first that the area in which gi is possible as a preposition cross-cuts the established distinction between Middle Alemannic and parts of Swabian, while in Switzerland, on the other hand, this usage is virtually non-existent. Such a pattern is typical for lexical variation and we will see more examples of it below. But here, it additionally correlates clearly with the (im-)possibility of reanalyzing the original preposition as a verbal head. Although I do not have area-wide data, informants from this region mentioned explicitly several times 13 that they can use gi/ge only with intransitive verbs. Translated into structural terms, this means that these speakers use gi/ge only with a nominalized form of the verb, without any functional structure. In the northern part of Swabian, gi is not used as a preposition and gi can occur at the left periphery of a more expanded infinitival complement to a higher extent. In the dialects in Switzerland and crucially also in the High Alemannic part in Germany, the prepositional use is not possible. The category label in these variants has eventually changed from P to V, with the relevant consequences for the syntax, specifically that it occurs left-adjacent to the lexical verb, forming a verbal complex.
In sum, the just-sketched case of syntactic variation shows all the traits that we would expect from parametric variation in the sense that there is a difference in the functional lexicon which has consequences for the outer syntax of the construction. Furthermore, the distribution in the relevant case concurs with the established subdivision of Alemannic.

The pattern of lexical variation
As was discussed above, variation in the (non-functional) part of the lexicon is expected to show patterns that deviate in a more unsystematic way from the established boundaries. A lexical content item can easily be borrowed and integrated into a language without touching the systematic parts of the grammar. Therefore, whether a certain item is borrowed (and actively used) is up to the individual speaker in the end. And whether it spreads or not is an issue of socio-linguistic questions and thus again outside of the grammar. The lexical item that illustrates such a situation is the Standard German verb verwenden (utilize).
The task was to translate the sentence in (6):
(6) Standard German
dass Goldschmiede am liebsten Gold verwenden, …
that goldsmiths at best gold use
'Goldsmiths use preferably gold.'
Although verwenden is a very common verb in German, it is felt by many dialect speakers to belong more to the written style. And indeed, the informants replaced this verb to a large extent. The bulk of replacements consisted of either brauchen (utilize) or nehmen (take). The areal distribution is depicted in Map 5. 187 informants out of 529 (35%) used verwenden; the map reveals that the percentage is higher in German-ALM. However, this aspect tells us merely that this verb is transferred quite easily into the dialect if it is offered in the stimulus sentence.
More interesting is the sharp division between brauchen and nehmen. Brauchen is used exclusively in Swiss-ALM. Even in the High Alemannic area in Germany, which patterned together with Swiss-ALM in the gV-construction, this lexical item is not found. Note furthermore that there is not a single occurrence of brauchen in Vorarlberg, a region that usually also patterns more with Swiss-ALM. But note the asymmetry: verwenden occurs in Swiss-ALM, as does nehmen. The areal distribution of the latter (most of the occurrences are found near the border to Germany) suggests that this is due to language contact. Thus, Swiss-ALM speakers show interference effects for both verbs. On the other hand, German-ALM does not use brauchen. Thus, the conclusion is that brauchen in this context belongs exclusively to the lexicon of speakers of Swiss-ALM. Compared to the maps that were discussed in the previous section, the historically grown sub-divisions of Alemannic are obviously not the decisive factor.
This situation is expected if the considerations from above are on the right track. Lexical change or borrowing produces areal patterns that do not coincide with the established sub-divisions. And this is due to the fact that the lexicon is consciously accessible by the speakers and thus a matter of free variation.

Clustering on the map: conventionalization
In this sub-section, I will discuss an instance of the situation described under 2. in the introduction. This pattern is somewhat in-between, as it does not follow the traditional sub-division but nevertheless shows rather clear areal distributions in terms of clustering of certain variants. The phenomenon in question is the order in 3verb-clusters. The variation attested in 3verb-clusters (either in German or Dutch dialects) is one of those topics where keywords like "surface variation", "optionality", and "under-determination" show up regularly. The reason is simple: nearly all possible linear orderings of the three verbs are attested in the various West Germanic dialects, see Schmid/Vogel (2004) and Wurmbrand (2006, 2017) for a detailed and comprehensive overview of the data and theoretical approaches in recent years. Seiler (2004) detected an interesting areal pattern of variation in Swiss-ALM: he shows that the attested patterns can be modeled by assuming that head-finality increasingly vanishes from East to West, such that the harmonic 123 orders 14 occur only in the West. He shows that there is an inclusiveness relation and models it in an Optimality Theory framework. Another prominent work on this issue in the context of the discussion here is Barbiers (2005). Confronted with a comparable situation in the Dutch dialects, he concludes that the actual choice of a specific order 15 in a given area is not driven by the grammar, but must instead be delegated to extra-syntactic factors. Finally, Sapp (2011) argues for the same point on the basis of diachronic data and shows that the factors that may influence a given choice of order are all extra-syntactic and thus that the variation cannot be captured as parametric variation.
With this background, let us have a look at rather recent data from SynAlm. The data I will discuss in this section are actually a "by-product" of a translation task that originally aimed at the form of the complementizer in temporal clauses. The sentence of this task is given in (7) (FB2_Q_12-3). 16
(7) Standard German
Als ich gehen wollte, kam Otto gerade
when I go wanted, came Otto just
'When I wanted to leave, Otto arrived at the same moment.'
Alemannic does not have a simple past; thus the modal verb inflected for past tense in Standard German was transformed by our informants into a finite auxiliary (have) and an infinitival form of the modal verb, 17 giving rise to the patterns of the 3verb-clusters in (8) and compare with Map 6:
a. …habe wollen gehen 123 (aux-mod-lex) light blue on the map
b. …habe gehen wollen 132 (aux-lex-mod) dark blue on the map
c. …gehen habe wollen 312 (lex-aux-mod) dark green on the map
d. …gehen wollen habe 321 (lex-mod-aux) light green on the map
These were the only orders produced by our informants in this task. Since the original task did not aim at the order in verbal clusters, there were neither contrasting orders presented nor judgments or preferences asked for. The informants produced these 3verb-clusters spontaneously and were thus completely unbiased.
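The numeric labels used for the cluster orders can be read off mechanically: going by the labels given with the four patterns, the auxiliary counts as 1, the modal as 2 and the lexical verb as 3, and the label is the sequence of these numbers in surface order. The following is a minimal illustrative sketch in Python, not part of the SynAlm tooling; the names `RANK`, `cluster_label` and `ATTESTED` are hypothetical and introduced only for this illustration.

```python
# Verb numbering as read off the labels above (an illustrative assumption):
# 1 = auxiliary, 2 = modal, 3 = lexical verb.
RANK = {"aux": 1, "mod": 2, "lex": 3}


def cluster_label(surface_order):
    """Return the numeric label (e.g. '312') for a surface order of verb types."""
    # A 3verb-cluster must contain each verb type exactly once.
    if sorted(surface_order) != sorted(RANK):
        raise ValueError("expected exactly one each of 'aux', 'mod', 'lex'")
    return "".join(str(RANK[verb]) for verb in surface_order)


# The four orders produced by the informants in the translation task.
ATTESTED = [
    ("habe wollen gehen", ["aux", "mod", "lex"]),
    ("habe gehen wollen", ["aux", "lex", "mod"]),
    ("gehen habe wollen", ["lex", "aux", "mod"]),
    ("gehen wollen habe", ["lex", "mod", "aux"]),
]

for example, order in ATTESTED:
    print(example, cluster_label(order))
```

The sketch only makes the labelling convention explicit; it says nothing about how the individual orders are derived syntactically.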
First, Swiss-ALM shows a completely uniform behavior in that only the order aux-mod-lex (123) occurs. In the Northwest there is a clear preference for clusters of the 312 type, in which the lexical verb is in initial position but immediately followed by the auxiliary. The order aux-lex-mod (132), on the other hand, clusters in the Swabian area. Vorarlberg shows a preference for an initial position of the lexical verb as well. Recall that Vorarlberg belongs politically to Austria, a country in which Bavarian is the predominant variant. Bavarian does not show the pervasive predominance of verb raising found in Alemannic or Dutch, although it is attested to a certain extent, and in fact the 312 order seems to be the most common one, see Patocka (1997). The order lex-mod-aux (321) occurs exclusively in the northernmost area. This is an area that does not belong to the Alemannic dialect group. 18 So far I have not discussed data from Alsace. The reason is that the number of informants is rather small and thus not representative. However, it is striking that the order in the verbal cluster is one of the few phenomena where the Alsatian speakers show completely uniform behavior. If we take into consideration that Alsatian speakers are in general bilingual, with French as the other language, the strong preference for the 123 order does not come as a surprise. Note that this is the only grammatical possibility in French. That means we have two regions (Vorarlberg and Alsace) where language contact obviously plays the crucial role for the actual outcome of the order in a 3verb-cluster. This finding corroborates the analysis proposed in Barbiers (2005). Note that language contact seems to be at stake as well in the northern region where the 321-order "trickles down" into the Swabian area. 321 is the Standard German order and we can be rather sure that all dialect speakers are confronted with this version to the same extent. 
Nevertheless, the further south one goes, the less the speakers produce it. Thus, language contact plays a role in the choice of the possible orders. But this is not due to interference from Standard German, since otherwise it would be hard to explain why this order is confined to a certain area. Recall that the interference effect concerning the lexical item verwenden did not show any regional clustering of this type.
I will refrain from discussing the syntactic analysis that produces the different variants but refer the reader instead to the above cited work. Important for the discussion here is that the choice of the order is prone to the linguistic environment and thus we can classify this type of variation as conventionalization.
A final remark on Swiss-ALM. That the Swiss-ALM speakers uniformly produced the 123 order seems to stand in a slight contrast to the findings in Seiler (2004: 380). He reports that 312 orders can actually be found in the East of Switzerland. There are two things to consider: (i) Seiler's data basis is much more fine-grained than the one reported here: he makes a distinction between judgment data and preferred variants. The data here, in contrast, stem from one translation task and are produced "spontaneously". 19 (ii) The informants of SynAlm knew that the questionnaire had been sent to all Alemannic speaking areas. And of course, dialect speakers know about the prominent features of their own dialect and to a certain extent about the differences with neighboring dialects.
18 As can be seen from the Wiesinger map, this area includes the transition zone between Swabian and Rhine-Franconian. This finding is a neat confirmation that verb raising, resp. the variable order in 3verb-clusters, is a typical feature of the Alemannic dialects. The reason why this area was included in SynAlm, although it does not belong to the Alemannic dialect group, is that it is immediately adjacent to the area covered by the project SyHD, see fn. 1.
19 As much as this is possible in a written questionnaire. But note that only 10% of the speakers reproduced the Standard German version with the simple past, i.e. wollte.
Thus, it is entirely conceivable that speakers in this task deliberately chose the variant that they knew was essentially unique to their dialect. The important thing for the discussion here is that it is exactly this area of the grammar where we find the leopard-spot pattern. In sum, this section has shown that the three differing patterns of areal distribution can be associated with three distinct types of variation, discussed in section 1:

(i) The form and position of the particle gV
This is a pattern based at first sight on a phonological difference, but one which had consequences for the syntax in terms of the pre-condition for reanalysis and with it a different syntactic analysis, i.e. parametric variation.

(ii) The distribution of the verbs brauchen/verwenden/nehmen
This is a pattern where the artificial border between Switzerland and Germany played the crucial role. This pattern was found with lexical variation, i.e. a part of the linguistic knowledge that is consciously accessible.

(iii) Ordering within the verbal cluster
This is a pattern where certain variants cluster but follow neither the established linguistic borders in terms of sub-dialects nor the artificial borders, i.e. a case of conventionalization.
In the next section, I will show that all three patterns established so far show up within one syntactic phenomenon, namely the distribution of the infinitival marker zu.
The occurrence of the three patterns will then be taken as the starting point for an analysis of the infinitival marker that is able to account for this fact.

Variation with infinitival markers
Standard German formally distinguishes between two types of infinitival complements: those that take a bare infinitive (modal and perception verbs) and those where the nonfinite verb form additionally has the marker zu. The presence/absence of the IM is generally taken to be an indication of the syntactic size of the IC: modal and perception verbs take only a bare VP as their complement, whereas the other constructions show some functional structure above the lexical VP, among them at least TP as the host of the IM. This varying size of the functional part is connected to the different restructuring possibilities, as has been shown at length by Wurmbrand (2001). What Wurmbrand also shows is that there is no dichotomy for restructuring, as would be expected from the simple presence/absence of the IM. Rather, there are different degrees of restructuring: from no restructuring at all (e.g. propositional attitude verbs) via semi-restructuring with the forget/try-class to full restructuring with modal verbs. Thus, the mere presence/absence of the IM does not correspond to the attested distinctions in terms of restructuring.
Moreover, as is known from the discussion of diachronic data, the IM has not always shown this uniform distribution. While modals and perception verbs indeed uniformly take bare VP-complements, inchoative verbs (begin) and simultaneous and irrealis verbs like consider, learn, dare, forget, try sometimes show an IM and sometimes do not, see Ijbema (2002) for Dutch and Demske (1994) for Old and Middle High German. Paul (1920: 97, §§333ff) already notes the optionality of the IM in Middle High German with this verb class.
Considering the Alemannic data that will be discussed immediately, the situation in this dialect in fact resembles the diachronic stages just sketched: while modal verbs and perception verbs do not have an IM in their complement, there is still a high amount of variation with the forget/try-class, whereas the propositional attitude verbs preferably take finite clauses as their complement, see Brandner (2006) for detailed illustration and section 3.4.1 for further discussion. Raising, as exemplified in (12), is very uncommon and, at least in German-ALM, 20 absent. Therefore, raising will not be discussed.
As the topic of this paper is the areal distribution, the question is now what type of patterns these infinitival constructions show. I will concentrate on the forget/try-type, section 3.4.1; purpose infinitives will be dealt with in section 3.4.2, and finally I will briefly touch the tough-construction in section 3.4.3.
To give a first idea about the amount of variation, consider table 1.

20 The alternative to raising that is highly preferred in Alemannic is a parenthetical construction:
(i) Er scheint rechtzeitig zu kommen (Standard German)
    He seems in time to come
(ii) Er kunnt - schiint's - zittig (Alemannic)
    He comes seems it in time
Both: "He seems to come in time."
21 The data stem from different projects on Alemannic syntax. In a first explorative project, only German-ALM speakers were consulted (N = 312). Later the same questions were presented to Swiss-ALM speakers (N = 420). The numbers for the purpose infinitive stem from a different questionnaire with 732 speakers in total. All the data in table 1 were obtained with the same method (written questionnaire) and involved yes/no judgment tasks. These data were already partially published in Brandner (2015).
There is a striking difference between the two countries when it comes to the acceptance of bare infinitives in the infinitival complement (IC henceforth). German-ALM accepts bare infinitives under the forget/try-verbs much more readily. Bare infinitives in purpose infinitives, by contrast, are accepted at the same rate in both countries. In tough-constructions, finally, we again find a very sharp difference.

Consider first Map 7 representing the forget/try-verbs. 22 The distribution does not follow the Wiesinger classification as neatly as in the first example in section 2. Observe that in the Swabian area as well as in the transition zones between Middle Alemannic and High Alemannic, the acceptance of the bare infinitive is quite high. In the northern parts of the Swabian area, it even seems to be the preferred version. In the Highest Alemannic area, in contrast, bare infinitives are not possible at all. In the transition zones we find the expected mixture, with a certain optionality. Still, we can conclude that the areal distribution is essentially based on the dialectal division and not on the national border. It thus seems to be a typical case of what was identified above as the pattern for conventionalization.

22 The type of element that introduces the purpose clause also differs across the Alemannic sub-dialects (für vs. zum). I will come back to that in section 3.5.

A rather different picture arises with purpose infinitives, see Map 8. Remember that the acceptance rates were essentially equal in both countries. On the map, however, we can see that the bare infinitives cluster in the East of Switzerland, with some additional instances in the transition zone near the German border around the city of Basel. In German-ALM, we find nearly the same amount of optionality in the Swabian part as with the forget/try-class. But note that in the north, the IM is obligatory, much in contrast to the findings with the forget/try-class.
This suggests that we are dealing here with two different phenomena, despite the uniformity in terms of the occurrence of zu in Standard German in both types of ICs.
Finally, as could already be read off from the numbers given above, the results for the tough-construction follow the national border in the way we have seen with lexical variation, cross-cutting the dialectal borders, see Map 9. There is neither variation within the transition zone nor a trickling-down effect into other areas of Switzerland. In view of these patterns, and with the background on the areal patterns from above, the initial hypotheses are the following:
• With forget/try-verbs, the pattern resembles that of the ordering in the 3-verb clusters, with some regions showing a uniform behavior and others showing some inherent variability. As such, it seems to be a case of conventionalization, and the task is to identify a possible grammatical basis for this variability.
• With purpose infinitives, the pattern results either from a parametric difference, given the quite clear-cut regions, or it is again an instance of conventionalization, since the regions in addition cross-cut the established isoglosses. The task is then to search for additional factors that may co-vary with this distribution of the IM such that a parametric variation can be justified independently.
• With tough-constructions, the distribution patterns with the cases of lexical variation. This is unexpected, since we are certainly dealing with a grammatical phenomenon. So the question in this case is whether we are dealing with a completely different phenomenon than the more typical instances of infinitival complementation, and whether the IM is thus of a different nature than in the other infinitival constructions.
These issues will be dealt with in the next section.

The infinitival marker
In this subsection, I will briefly give an overview of the common views on the nature and development of the IM before we enter the discussion about the variation in the contemporary dialects. It is a widely accepted scenario that the IM in the West Germanic languages evolved out of the allative preposition zu/to/te and grammaticalized into an infinitival marker, see Haspelmath (1989), also Paul (1920, §345). The general idea is that the notion of goal which is entailed in this preposition can easily be transferred to the notion of future and/or irrealis. Semantic bleaching of the preposition then allowed it to occur in contexts without a future meaning, such that it is now even compatible with factive verbs like regret, which refer to an event in the past. Ijbema (2002), in her detailed discussion and analysis of the role of the IM te in Dutch, looks into the diachrony of Dutch te and finds a situation astonishingly similar to the one in contemporary Alemannic. Again, the critical cases are verbs of the forget/try-class, where the IM is optional, just as in contemporary Alemannic. Ijbema models this optionality by assuming that it is a side effect of the not yet completed grammaticalization of the preposition te into an IM. According to her analysis, te first realized the [irrealis] component within the IC. The bleaching mentioned above is accompanied by re-categorization to a particle, and this made it possible for te to occur in T. This is the syntactic position where it is located in contemporary Dutch. Transferred to the other Germanic languages with a similar development, this scenario is much in line with the work on Tense in infinitivals. Stowell (1982) already suggested positing a tense node within infinitives, see also Enç (1987) and Landau (2004), but see Wurmbrand (2014) for a different view, which will be taken up below.
Still considering the diachronic issue, there is abundant evidence that zu-marked infinitives increased over the history of all the West Germanic languages. Traditionally, it is assumed that the increase of zu-infinitives is due to a stabilization process in which the bare (or optionally marked) infinitives were replaced by zu-infinitives. But Los (1999) shows for English that the increase of the IM is rather due to the fact that the finite subjunctive complements of e.g. propositional attitude verbs were replaced by the marked infinitive, and that the bare or optionally marked infinitives were affected only to a rather low degree. Smirnova (2016) presents a detailed diachronic investigation of manipulative verbs (force, compel) and directive verbs (order, advise). First of all, zu-infinitives are found with these verbs already in Old High German, see also Demske (2001), whereas bare infinitives occur rarely with these verbs. Second, the spreading of the zu-infinitive is again due to the replacement of the former finite clauses in this environment. Thus, she confirms Los' (1999) findings for German, namely that zu-infinitives replaced finite/subjunctive complements and not bare infinitives. The apparent grammaticalization process thus consists merely of a higher percentage of zu-infinitives in these environments. The spreading of zu-infinitives to additional environments, due to the grammaticalization path postulated by Haspelmath (1989) and Ijbema (2002) and triggered by semantic bleaching of the preposition, can therefore not be confirmed. These findings cast doubt on the grammaticalization scenario sketched above, especially on the claim that the increase of zu-infinitives is connected to the semantic bleaching of the preposition: the zu-infinitive could replace the subjunctive embedded clause from the beginning, even under verbs that should show this behavior only in later stages if semantic bleaching were a crucial factor in the development.
Considering in addition the contemporary variation with respect to the absence/presence of the IM with the forget/try-class in Alemannic, which is found in Middle Dutch and in Old and Middle High German as well, the conclusion is that what has been stable for hundreds of years is the variation concerning the presence/absence of the IM with the complements of the forget/try-class of verbs. In other words: we seem to have detected an area in the grammar where underspecification is at stake, in the sense that two different realizations are possible on nearly equal terms. And this variability 23 seems to be systematic, given the persistence of the optionality of the IM. Therefore, one should seek a principled reason why this verb class is so special. As a final remark closing this section, recall that in standardized German only the zu-infinitive is licit, cf. (11). This requirement is probably an artefact of the standardization process itself. During this process, one of the two possible versions was chosen as the "correct" one for the written standard variant. From this perspective, we are dealing with a case of conventionalization, this time driven by normative grammarians. In sum, the diachronic evidence as well as the areal patterns in the contemporary dialect hint at conventionalization, as defined under 2 in section 1 and illustrated in section 2.4.
Looking at the different types of verbs listed in (10)-(15) with respect to the presence of the IM, what immediately comes to mind is that it is connected to the simultaneity of the two events described in the matrix clause and the IC. The temporal extension of modal and perception verbs clearly coincides with the temporal extension of the embedded event. In contrast, with a propositional attitude verb like promise, the two events, the one of promising and the one described by the embedded verb, can be temporally separated. If zu is situated in T⁰, the non-simultaneity can be accounted for simply by the presence of a T-projection in the IC with its own temporal reference, whereas there is no such functional layer above the lexical verbal projection in the other cases. But as we have already seen, the situation is more complex and such a simple dichotomy is not sufficient: on the one hand, we have the forget/try-verbs with their inherent variability, as just discussed; on the other hand, purpose infinitives and tough-constructions do not fit into the simultaneity picture either. Note that a purpose clause, for example, may be adjoined to a nominal expression, and its future orientation is completely independent from the temporal expansion and the lexical meaning of the matrix verb. Tough-constructions like this book is easy to read have more of a generic interpretation, since the IC denotes merely a property, and thus the question of simultaneity does not arise in the same way as with modals and other IC-selecting verbs. Therefore, a new perspective on the role and the semantic contribution of the IM is called for. Such an approach will be sketched in the next section, followed by an application of it to the individual types of ICs whose areal patterns gave rise to the suspicion that the uniformity of the IM and its clear-cut distribution in Standard German rather camouflage its diversity.

23 Note that it is exactly this class of verbs which shows variable behaviour in Standard German when it comes to restructuring, see Wurmbrand (2001) for discussion. Thus, the obligatory insertion of zu obviously did not change the syntactic properties, if we take the varying restructuring possibilities as an indication for the "in between" status of these verbs also in terms of the functional layers present. From this perspective, it might be worthwhile to reconsider the purely structural arguments in terms of the presence of syntactic layers that allow (or prevent) restructuring. But a full discussion of restructuring is beyond the limits of this paper.

Anchoring and the feature [coin]
In the following, I will use the universal spine approach, as proposed in Wiltschko (2014) and preceding work by her with Ritter, e.g. Ritter & Wiltschko (2009). The important insight from this work is that Tense, as we know it from the Indo-European languages, is just one instance of the possibilities to anchor an event. Based on a detailed analysis of data from various languages which are reported to have no tense distinctions in our understanding, they suggest that instead of temporal values (past, present, future), deictic local expressions (proximal vs. distal) or 1st/2nd person vs. 3rd person pronouns 24 may be suitable to anchor an event as well. They posit the universal category Anchoring Phrase and suggest that its featural specification consists only of the basic feature [coin] for coincidence. Coincidence distinguishes whether or not an event is coincidental with a contextually or linguistically given anchor. As such, it directly captures the notion of simultaneity discussed above, and we will see immediately that the system gives us the flexibility needed to account for the situation described in the previous section. A general structure of anchoring is given in (16), whereby arg sit is an abstract representation of the two contents to be related:

In a language without tense-marking, the Anch⁰ head hosts a lexical item (locative, person) that delivers an equivalent basis for the computation of the plus or minus value of [coin], e.g. the proximal preposition for [+coin]. Embedded finite and non-finite clauses have an AnchP as well. The difference is that the situational argument in the specifier is in this case anaphoric (pro) and is bound by the event time of the matrix AnchP:

According to Ritter & Wiltschko, the value of the embedded Anch⁰ of an infinitive is determined by the lexical properties of the matrix predicate.
This type of value assignment is called predicate valuation, and it means that the matrix verb determines the form of the infinitive (specifically whether it shows up with an IM or not) via lexical selection. They then suggest that simultaneous verbs like try and begin value for [+coin], whereas propositional attitude verbs as well as manipulative verbs, for example, value for [-coin]. Thus, they adhere to a purely lexical-semantic approach. As for the role of zu, they remain silent. The way it is implemented, the system can again deal only with the strict dichotomy, which we have seen is not adequate. Thus, a slight modification is necessary. But before going into the further discussion, a few remarks concerning the notion of coincidence are in order. It goes back to Hale (1986). Hale argues that this binary distinction is found in essentially all areas of the grammar and that it is the one basic distinction around which grammars of natural languages are built. Besides the temporal system (including aspect) and the pronominal system, 25 the difference between locative and directional prepositions can be subsumed under this notion as well. Interestingly, the core example in Hale (1986) for the illustration of coincidence and non-coincidence is the difference between the prepositions to and at. At is the prototypical representative of coincidence; other examples are by, in, along, over, past, through, and with, see Rapoport (2014) for further discussion. To, 26 on the other hand, is an example of non-coincidence due to its inherently complex semantics. What kind of local relationships does to entail? First, there is a fixed location, i.e. the goal, like the store in run to the store. Remember from above that this is the component that is taken as being responsible for the future/irrealis interpretation of the IC, which was then subsequently bleached.
Secondly, there is in addition a path (motion) involved, namely from the (not necessarily defined) position where the path starts until the goal is reached. Thus, the preposition zu entails a temporal expansion. It is this latter component which I will take to be the relevant one for its suitability as an IM: only if a notion of temporal expansion is present can coincidence with an event be computed in a meaningful 27 way. These considerations make it superfluous to posit a semantic bleaching process for this preposition in order to account for the cases where no goal (i.e. future/irrealis) is involved. And as was discussed in section 3.1, there are serious problems with this assumption in any case. Furthermore, recall that with verbs that typically take an IC (manipulative verbs), the zu-infinitive was already present in the oldest texts. If I am right, there was no slow bleaching process; rather, the preposition was used for lexicalizing the [coin] value from the beginning. And this is not due to the notion of goal but rather to the fact that zu (like de in Romance) lexically involves a temporal expansion. It is this meaning component which makes it suitable for lexicalizing the [coin] feature. Now recall that in the Ritter & Wiltschko system the value for [coin] is not necessarily bound to the verbal system. There are languages that regularly use prepositions in order to anchor the event. I see no reason why languages should not mix the systems and use a preposition as the overt realization (m-valuation) of the respective value for the embedded infinitival [coin] feature. And as just discussed, the preposition zu is very well suited for this task, since it brings in a temporal expansion, which is necessary for computing the [coin] value.
The suggestion is thus that in the case of a [-coin] value in a finite clause with the subspecifications for past and future, the Germanic languages use the verbal inflection and the Anch-head is realized as a functionally extended verbal head (=traditional TP). In case of an infinitival complement, a preposition is the m-valuation of the [coin]. This means we do not have to adhere to predicate valuation, as Ritter & Wiltschko do. Note that if predicate valuation is assumed to be responsible for the valuation, there is no way to account for the distribution of the IM in a systematic way. But if the Anch-head shows m-marking on its own, similar to a finite clause, predicate marking in the sense that it directly determines the form of the complement via selection is superfluous. That the [coin] value of the IC matches with the lexical meaning of the matrix verb is then a matter of a "compatibility check" rather than syntactic selection. This will give us the necessary flexibility when it comes to the forget/try-class.
Based on this, I will suggest the following valuation rules:
i. zu for the [-coin] value
ii. zero marking for the [+coin] value
Note that zero marking for the [+coin] value also holds for finite tense: present tense, i.e. [+coin], is equally not marked morphologically in the Indo-European languages. This parallelism makes the proposal attractive, since such a parallelism cannot even be formulated if one sticks to extra-linguistic notions like goal etc. as the reason why zu is used in ICs. The proposal is furthermore much in line with recent considerations on the role of tense in infinitivals in Wurmbrand (2014). Wurmbrand argues extensively that there is no syntactic TP in ICs, in contrast to the more traditional treatments of ICs. According to her, three different types of infinitives have to be distinguished:
i. future infinitives (like want, expect): the future interpretation is due to the presence of a modal-like projection (wollP), which crucially does not involve a tense value.
ii. simultaneous propositional verbs (like claim and believe): the reference time of the infinitive is related to the attitude holder's NOW.
iii. simultaneous non-propositional verbs (like the try/forget-class): they form a single temporal domain with the matrix verb.
The main diagnostic is the (in-)compatibility with time frame adverbials like tomorrow and yesterday, which are computed based on the reference time of the matrix clause. The first class allows these adverbials (Today, John expects to leave tomorrow), whereas the other two classes show restrictions. Specifically, the third class does not allow such an adverbial at all (*John tried/began to leave tomorrow), whereas with the second class, an adverbial is possible if there is an additional aspectual marking (John claimed [to have left/to be leaving/*leave at three]). As I am mainly concerned with the third class, whose temporal properties will be discussed in more detail in section 3.3.1, I will only briefly mention that the two other classes can be captured quite easily in the system here: for future infinitives, the value is clearly [-coin] and the occurrence of zu is expected. As for class ii, Wurmbrand shows that the temporal expansion of the attitude holder's NOW and that of the embedded event do not coincide (there is either a shifted past or an imperfective in which the attitude holder's NOW is included), and thus zu-marking is again expected.
Concerning the infinitival marker, Wurmbrand (2014: 414, fn 8) explicitly remains neutral when it comes to its position and function in ICs. However, she states that zu is certainly "not a tense element". In addition, according to her, it seems pointless to give it a unified semantics in view of the fact that it is compatible with so many different meanings, cf. the different types of infinitival constructions illustrated for German above. But with the considerations about its meaning contribution in terms of coincidence and the ingredients just proposed, the undertaking may not be that hopeless.
In sum, I assume instead of a TP an AnchP 28 with its [ucoin] feature to be valued. Note that the assumption that the head of AnchP is category neutral in the sense that it might host prepositional or nominal categories as well, allows us to posit the preposition within the clausal projection without the necessity to postulate a process of re-categorization to a particle. In this sense, we can again depart from the traditional grammaticalization scenario. In the next section, I will now discuss how the three different classes of verbs for which we have seen the varying areal patterns can be analyzed in the system proposed here.

The feature [coin] in infinitival complements
For the sake of completeness, let me just mention that the complements of modal and perception verbs now are expected to not show an IM. From a semantic point of view, it seems uncontroversial that they cannot have a different temporal expansion than the event described by the VP. And as mentioned above, throughout the diachronic development as well as in contemporary Alemannic, the ICs of these verbs are bare infinitives.
As a reviewer points out, the system proposed here, in which the IM is indeed meaningful, seems problematic for a verb like wollen. This verb construes like a modal verb with a [+coin] value and thus without an IM. The verb wünschen (wish) however has essentially the same semantics but construes as a future infinitive with an IM and thus has a [-coin] value.
First note that wollen is the only modal verb that can alternatively combine with a finite clause in Standard German:
(18) a. Hans will mehr Geld verdienen
        Hans will more money earn
        'Hans wants to earn more money.'
     b. Hans will, dass er mehr Geld verdient
        Hans will that he more money earns
        'Hans wants/wishes that he earns/would earn more money.'
Such an alternation is not possible with the other modal verbs. This already shows that the modal wollen has entered the lexicon of German with a different entry, namely as an alternative to wish/desire. Of course I admit that these verbs are quite close in terms of their lexical semantics. But this is not necessarily an argument against treating zu as a meaningful element in the way I sketched above. Close or nearly identical lexical content can often be expressed by using different syntactic constructions. Wollen is a modal verb that is very close to an interpretation as a future infinitive, especially since the ordering source does not come from the outside. Thus, it would even be expected to be able to occur with an IM, which it does not (yet?).
As the reviewer further notes, there is another problematic case where both types of ICs are possible: this is brauchen (need) in German. In its negated form, it is interpreted as a modal verb and consequently it can also occur without the IM:
(19) Du brauchst das nicht (zu) tun
     you need that not (to) do
     'You don't need to do that.'
Although I claimed above that modal verbs are temporally coincident with the embedded event, this is in fact not as obvious as it is, for example, with the perception verbs. The reason is that modalization brings in aspects beyond the purely temporal relation between two Anch-heads in terms of coincidence. Thus, I agree that a satisfying account of modal verbs requires a much more careful treatment in order to capture the various further semantic effects of modalization. It may very well be that the reason for their not having zu-complements must in fact be found somewhere else. 29 However, this is again not an argument against treating zu as a meaningful element. I will leave the discussion about these further aspects of modal verbs for future work. Instead, I will now turn to the question of how to analyze the simultaneous non-propositional class (try, forget), exactly the class that shows the variable behavior systematically and independently of different readings.

Try and its kin
Consider first the following examples, in which an achievement verb and an activity verb are each combined with either a modal verb or try: 30
(20) Achievement verb
a. Ich versuchte stundenlang ihn zu erreichen. (try + ADV ok)
   I tried for hours him to reach
   'I tried to reach him for hours.'
c. Ich musste/durfte/konnte (stundenlang) arbeiten (mod + ADV ok)
   I must/was allowed/was able (for hours) work
   'I had to/was allowed to/was able to work (for hours).'
(20b) is out because the temporal adverb for hours is combined with an achievement verb, which does not have a temporal expansion. But when it is construed with try, the adverb becomes possible. Adding a modal verb does not lead to this effect. This shows that, whichever further factors come into play with modal verbs, at least their temporal expansion is immediately dependent on the properties of the lexical verb, cf. the brief discussion in section 3.3. With activity verbs, no differences can be observed in either version. So the relevant contrast is between (20a) and (20c). Grano (2011) suggests that try brings in an additional temporal component, which he calls the preparatory phase. And it is obviously this preparatory phase that can be modified by the adverb. The preparatory phase is seen as a (mental) state that turns into an inner stage as soon as the event described by the embedded verb has started. Importantly, when try is used, the preparatory phase has already begun, i.e. one part of the complex event is already realized.

29 What immediately comes to mind is that the verbal complement of modals does not have any functional structure at all (thus also no AnchP) and that they are directly inserted in an extended functional projection of the verb, see again the various works by Wurmbrand. In this case, the absence of the IM would follow trivially.
30 With the can-modality, the temporal adverbial is possible. However, the interpretation in this case is the availability/disposal interpretation.
This is the important difference to e.g. want, where there is no such intimate connection. Want is a typical future infinitive in the sense of Wurmbrand (2014), which is compatible with two different time frame adverbials, e.g. yesterday, I still wanted to leave tomorrow. This is not possible with try, cf. *I tried yesterday to meet you tomorrow. 31 A similar temporal structure can be posited for forget, although reversed in a sense, since there is of course no preparatory phase. But a mental state must still be assumed that covers the inner stage of the event in the same way as with try. Thus, forget itself also has a temporal extension of its own, independent from that of the embedded lexical verb. When considering phase verbs, similar considerations apply, and the same is true for dare. The important point is that, in contrast to modal and perception verbs, there are two distinguishable phases with these verbs, which are nevertheless temporally tied to each other: one where the event and the try-state coincide and one where there is only a try-state in the form of the preparatory phase. The claim is now that exactly this special property is the basis for the optionality of the IM with these verbs: both specifications of the infinitival complement are compatible with the special temporal properties of these verbs, and thus it is expected that both versions occur to nearly the same extent.

31 Ijbema (2002: 103) cites an example where this seems to be possible (translated from Dutch to German):
(i) Gestern hab ich versucht ihn morgen nicht treffen zu müssen.
    Yesterday have I tried him tomorrow not meet to must
    'Yesterday, I tried to not have to meet him tomorrow.'
I agree that the example is somehow acceptable, but this is obviously due to coercion: the correct paraphrase would be: Yesterday, I tried (to make arrangements) such that I don't have to meet him tomorrow. Grano (2017) discusses several cases of coercion in uncommon uses of try.
Let us then come back to the different areal patterns. We have just seen that the overall temporal organization of the forget/try-class is inherently flexible. Thus, the point at which variation is to be expected from a semanto-syntactic point of view could be identified. The question now is: how can we explain that in some areas the variability is directly reflected in the acceptance rates 32 (recall that 39% in German-ALM accept no IM with forget), whereas in others only the version with an IM is possible, cf. the Highest Alemannic region? Assume the following scenario: as we know from the diachronic data, the variation between IM and zero-marking is a stable property of the ICs of these verbs. Thus, the language acquirer was and is confronted with variable input. But due to the special semantics of this class of verbs, either value for [ucoin] is compatible with the temporal organization, and the optionality of the IM is tolerated by the grammar. Under this perspective, the question is rather why the IM occurs so regularly in the Highest Alemannic regions, i.e. why do we not find the same amount of variation there? If we consider the maps for the forget-type, it is obvious that contact plays a role, i.e. the closer a Swiss German region is to Germany, the more readily the zu-less infinitives are accepted. Now note that the account proposed above does not per se exclude a uniform marking with these verbs, even without a prescriptive grammar. In High Alemannic, the zero-form was obviously less common at a certain point in time. The variable input ceased and the possibility of a bare infinitive finally died out.
In sum, the variational pattern that is found with the absence/presence of an IM in the forget/try-class of verbs is a typical instance of conventionalization in the sense that both values are compatible with the temporal structure of these verbs. In some areas, this situation is reflected directly by the near free variation between the different versions; in others, a uniform pattern is preferred. But there is no indication that either choice has further consequences in terms of interpretation.
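The logic of this account can be made explicit in a minimal sketch. It is only an illustration under stated assumptions, not part of the analysis itself: all names are hypothetical, and the mapping from feature values to surface forms (the IM zu for [-coin], zero-marking for [+coin]) is my simplified reading of the proposal. The sketch shows that optionality arises exactly when a verb class is compatible with both values.

```python
# Hypothetical encoding; none of these names come from the paper.
PLUS_COIN = "+coin"
MINUS_COIN = "-coin"

# Assumption: [-coin] is spelled out by the IM "zu", [+coin] by zero-marking.
SPELL_OUT = {MINUS_COIN: "zu", PLUS_COIN: "zero"}


def possible_markings(compatible_values):
    """Return the set of surface markings licensed by the given feature values."""
    return {SPELL_OUT[value] for value in compatible_values}


def is_optional(compatible_values):
    """The IM counts as optional if more than one surface marking is licensed."""
    return len(possible_markings(compatible_values)) > 1


# The forget/try-class is compatible with either value (see the discussion above).
FORGET_TRY = {PLUS_COIN, MINUS_COIN}

if __name__ == "__main__":
    print(sorted(possible_markings(FORGET_TRY)))
    print(is_optional(FORGET_TRY))
```

A class compatible with only one value would license a single marking in this sketch. Which of the licensed options a given region conventionalizes is not modeled here.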

Purpose infinitives
Let us now look at the zu-less infinitives in purpose clauses. First recall the facts: the possibility of having a zero-marked infinitive covers by and large the same area. In German-ALM there is a high amount of variation, i.e. speakers can choose rather freely whether an IM occurs or not. In Swiss-ALM on the other hand, the area in which there is free variation is confined to the East. At first sight, this is a nice corroboration of the varying input scenario that we have seen above for the forget/try-verbs, as language contact seems to play a role. But we will see immediately that things are different here and that we are dealing with a "well-behaved" case of syntactic variation such that the possibility of variation is tied to the functional specification of the various complementizers used in a given variant.
A purpose infinitival clause is not selected by the matrix verb. Instead, it modifies either a nominal expression (I took x in order to achieve something (with x)) or the event expressed by the VP (I did x in order to achieve something (via x)). Thus, we might ask whether the issue of temporal relation, and thus the [±coin] value, is relevant at all in these cases.
But clearly, purpose clauses occur with ICs, and the reason is that the notion of purpose already entails that the purposed event is always a (relative) future irrealis, irrespective of the lexical meaning of the verb in the matrix clause. Given this, the value in a purpose infinitival clause is always [-coin]. For this reason, purpose infinitives should always show up with an IM, contrary to the facts found in Alemannic. Recall that about 20% of the speakers accepted a zu-less infinitive in these constructions. Now purpose infinitives are distinct from other ICs not only in that they are not selected; in addition, they are introduced by a complementizer. In Standard German, this is um:

German
Ich habe zu wenig Kleingeld, um eine Fahrkarte zu kaufen.
I have too less cash in order a ticket to buy
'I don't have enough cash (in order) to buy a ticket.'

In Alemannic, purpose clauses may be introduced by um as well, but there are two alternatives to this complementizer, namely zum or für (for English: in order). Map 10 33 shows the distribution of these three 34 forms.
The region that uses für is co-extensive with the one where zu-less infinitives are disfavored, cf. Map 8. Thus, these two properties must be related and we should be able to find a syntactic explanation for the differences. Looking at the numbers for this task, as summarized in Table 2, the correlation is even more obvious: The table summarizes the varying patterns that were obtained from the translation 35 task in FB3-6-3, which is given in (22). The translations are categorized according to the element introducing the purpose infinitive (für, zum, um) and additionally according to whether or not the infinitival verb is preceded by an IM. About 22% of those with zum in the left periphery accept 36 a bare infinitive. But less than one percent of those using für or um can have a bare infinitive. Thus, the choice of the initial element plays a crucial role in allowing a bare infinitive or not.

33 The task used in SynAlm for purpose infinitives is identical to a task in SADS. SADS and SynAlm collaborated such that SynAlm took over some of the questions from SADS in order to get the same type of data for Germ-ALM. The results concerning für and zum in Switzerland as presented here are nearly identical to those given in Seiler (2005), based on data from the project SADS. Even the effect that speakers using zum accept zero marking more readily in the embedded infinitive could be reproduced. This is a nice confirmation that the data gained with the methods used in both projects are reliable. Seiler (2005) presents a detailed discussion of several aspects of the purpose infinitives and bases on them a general discussion about the nature of syntactic isoglosses. As I am interested here mainly in how the data fit or contrast with the other infinitival constructions, I refer the reader to his work for more aspects of purpose infinitives in Alemannic.

34 Note that um is scattered all over the Alemannic area. This is probably an interference from Standard German. But note in addition that in the non-Alemannic area to the North, zum is virtually absent.
Note that the IM in these cases is zum. I will follow Postma (2014) 37, who suggests that this form is the result of T-C movement of zu, which adjoins to um, which is in turn base generated as a complementizer in C0. This yields in the end the surface form zum. This analysis immediately gives us a clue to account for the attested variation: in those variants

35 This is a cut-out from the SynAlm database, showing the results in the form of a 'datasheet'. The results should be read as follows: The leftmost column gives the forms according to which the translated sentences were categorized. The next column gives the overall percentage of attestations. The further columns then distinguish between the different countries in which Alemannic is spoken: BW: German-ALM; CH: Swiss-ALM; EL: Alsace; VA: Vorarlberg.

36 Note that the bare infinitive is also the preferred one: only 7% use an additional zu-marking in the infinitival complement. This "over-marking" can rather safely be attributed to the influence of the prescriptive standard variant, which invariably posits a zu-IM immediately preceding the infinitival verb in all kinds of ICs (except modals).

37 Postma (2014) develops his analysis based on data from a Pomeranian dialect spoken today in Brazil, which nevertheless shows astonishing parallels to the Alemannic situation in other respects as well. He connects the assumed movement of zu to the C-position with a "weak" Tense in this language. Evidence for this are the impoverished paradigms for past tense. Again, the situation in Alemannic is the same. As a close investigation of these issues is beyond the scope of this paper, I will leave it for future work. But note that this uniformity in patterns shows that if there is no prescriptive standard variant, the natural development in the end generates a consistent distribution.

tions of tough-constructions, see for example Hicks (2009). I will merely consider those aspects that are relevant for the temporal organization.
First, as with the purpose clauses, there is again no direct interaction between the lexical semantics of the matrix verb and the infinitive. This of course has to do with the fact that the only type of matrix verb that occurs in tough-constructions is the copula. A copula merely asserts that a given property holds at a certain time. Under this perspective, the IC in this case denotes a property rather than an event. Thus the question of coincidence is in a way obsolete, since there is trivially always coincidence. In the system proposed here, this would mean that we do not expect the IM zu at all. Still, in Standard German, the tough-construction shows the IM zu, indistinguishable from the IM occurring in the other types of ICs. Again, a more fine-grained analysis of the data is necessary.

Until now, I have neglected the issue of the form of the IM. Its typical form is zu (which is realized as z' in Alemannic). Another form that we have encountered is zum, e.g. in the C-position in the purpose clauses. In Swiss-ALM, zum in purpose clauses occurs only in the C-position, as is evident from Table 2. But note that in German-ALM, 52 informants used this form 40 even in the low position inside the VP, i.e. it seems to be a phonological variant of zu. This impression is bolstered by the observation that the form zum may occur in German-ALM even in ICs under the forget/try-class (acceptance rate 32%), whereas in Swiss-ALM only 6% 41 of the speakers accepted this form. Furthermore, zum in tough-constructions is highly preferred (80%) in Swiss-ALM, whereas in German-ALM either z' or zum or, as seen, even the zero-variant may occur. From this picture we can conclude that zum in German-ALM can be conceived of as a lexical variant of zu, whereas in Swiss-ALM it is restricted either to the initial element in purpose infinitives, where it does not have a pure temporal interpretation, or to tough-constructions.
Under this perspective, we are indeed dealing with a case of variation on the lexical level, and the unexpected pattern in map (12) could thus be subsumed under the label free variation, but only in the sense that in German-ALM the surface form of the IM is manifold, including a zero-variant. This would mean that the zero-marking in tough-constructions is again of a different nature than the variation with the forget/try-class or with purpose infinitives. In which way the use of the zero-variant in German-ALM indeed correlates with a different syntax must be left open here. Until now, only the data reported here are available. Still, the weird pattern with the tough-constructions should lead us to doubt that the seeming uniformity of tough-constructions with the other types of infinitives in Standard German (and in English) is an indication of a similar syntax.
The data from Swiss-ALM show clearly that the IC in tough-constructions is of a different nature. What immediately comes to mind regarding the form is that the IC in tough-constructions is nominal in nature, since the preposition bears an inflection for dative (-m), equivalent to occurrences as in Ich geh zum Arzt (I go to-dative doctor). In German-ALM, this specification seems to be lost (or can be cancelled), as zum can occur in verbal environments. Similar considerations would apply to other occurrences of the preposition zu with a non-finite verb form, as in the equivalents to English something to read or Dutch zit te lezen (sit down to read), the latter with a progressive interpretation. It seems again that the non-finite verb in these cases shows the traits of a nominal structure rather than of a clausal projection. If this is true, then there is no Anchoring Phrase at all. On the other hand, the progressive and the modal ability-reading (as in something to read) clearly denote a temporal expansion. Future work in dialectal research should have a close look at the possibly very small morpho-phonological differences in the realization of the preposition in these cases. It is highly plausible that differences similar to those found in Alemannic will be detected in the dialects/spoken variants of English and Dutch as well.

Conclusion
The overall aim of this paper was to show that it is premature to dismiss unexpected or weird areal distributions of morpho-syntactic phenomena in the contemporary dialects as not useful. Instead, these deviations were taken as a starting point for a deeper exploration of the syntactic properties of the phenomena in question. What we have seen is that the conventionalization pattern with the ICs under the forget/try-class is (i) diachronically stable and (ii) finds a natural explanation when the temporal organization of these constructions is considered more closely. On the other hand, the distribution of the infinitival marker in purpose clauses could be explained by adhering to a classical micro-variational analysis in terms of a different feature specification of the lexical items involved. Finally, the indeed unexpected pattern with tough-movement, which resembles the variational patterns found in the lexicon, can be taken as a motivation to re-think its syntactic analysis. As more and more data of the kind presented here become available in the years to come, I hope to have shown that it is worthwhile to have a close look especially at those patterns that deviate from what is expected. It seems that these patterns can give us insights into the nature of (syntactic) variation that were previously not possible, due to the lack of this kind of data.