Chapter 6.1 Introduction

Malcolm Ross and Andrew Pawley and Meredith Osmond

**Map 1.1:** Oceanic and non-Oceanic Austronesian languages

1. Aims¹

This is the sixth and last of a set of volumes on the lexicon of the Proto Oceanic (POc) language.² POc was the immediate ancestor of the Oceanic subgroup of the Austronesian language family. This subgroup consists of all the Austronesian languages of Melanesia east of 136˚ E, together with those of Polynesia and, with two exceptions, those of Micronesia—around 500 languages in all (see Map 1.1).³ Extensive arguments for the existence of Oceanic as a clearly demarcated branch of Austronesian were first put forward by Dempwolff (1927, 1937), and the validity of the subgroup is now recognised by probably all scholars working in Austronesian historical linguistics.

The development and break-up of the POc language and speech community were stages in a truly remarkable chapter in human prehistory—the colonisation by Austronesian speakers of the Indo-Pacific region in the period after about 2000 BC. The outcome was the largest of the world’s well-established language families and (until the expansion of Indo-European after Columbus) the most widespread. The Austronesian family comprises more than 1,000 distinct languages. Its eastern and western outliers, Madagascar and Easter Island, are two-thirds of a world apart, and its northernmost extensions, Hawai‘i and Taiwan, are separated by 70 degrees of latitude from its southernmost outpost, Stewart Island in New Zealand.

**Map 1.2:** Geographic limits of historically identified Oceanic speakers and presently documented Lapita sites

A strong school of opinion associates the subsequent break-up of POc with the rapid colonisation of Island Melanesia and the central Pacific by bearers of the Lapita culture between about 1200 and 900 BC (see Map 1.2 and vol. 2, chapter 2).

The present project brings together a large corpus of lexical reconstructions for POc, with supporting cognate sets, organised according to semantic fields and using a standard orthography for POc. We hope that it will be a useful resource for culture historians, archaeologists and others interested in the prehistory of the Pacific region. The comparative lexical material should also be a rich source of data for various kinds of purely linguistic research, e.g. on subgrouping (as in §1.8 and §1.9), phonological change, semantic change and semantic structure (e.g. colexification) in the 500 or so daughter languages.

Volume 1 of The lexicon of Proto Oceanic reconstructs terms associated with material culture. Volumes 2, 3 and 4 examine relevant sets of cognate terms that provide insights into how POc speakers viewed their environment. Volume 2 deals with the geophysical or inanimate environment, and volumes 3 and 4 treat plants and animals respectively. Volume 5 and the present volume return to terminologies centring on people. Volume 5 concerns gender and age, the body, and human conditions and physical and cognitive activities that arise from nature rather than nurture. The present volume concerns culturally learned structures, including social organisation, beliefs in the supernatural, the seasons of the year, counting and other elements of non-material culture.

A consideration of the totality of our reconstructions across volumes 1 to 5 has led to an unexpected reassessment of the origin of Proto Oceanic (§§1.8–1.9) together with a small revision to its phonology (§1.8.2.4).

The editors had intended to provide a seventh volume that would perform several functions. It would treat closed classes of lexical roots, review the project’s main findings concerning Proto Oceanic speakers’ culture and environment, and compare these findings with what archaeology tells us about the way of life and environment of the bearers of the Lapita culture. Some of these matters are partially folded into the chapters of the present volume, e.g. social anthropology into chapters 3 and 4, archaeology into chapter 5 and archaeogenetics briefly into chapter 15. Two factors have led to the decision not to proceed with volume 7 and to make this the last volume. Firstly, the editors are now octogenarians and would like to live somewhat less hectic lives. Secondly, and importantly, funds have been provided by the Australian Research Council’s Centre of Excellence for the Dynamics of Language to set up a publicly accessible electronic database of the reconstructions from the six volumes along with their supporting data, thereby fulfilling at least the purposes of the cumulative indexes intended for volume 7. The database will also provide a locus for updating the project’s findings and for additions by other scholars.

This introduction follows a similar path to that taken in earlier volumes, but deviates in §1.8 and §1.9 to outline the fresh insights into the prehistory of Proto Oceanic itself based on the reconstructions in volumes 1–5. Section 1.2 gives an overview of this volume’s contents, and §1.3 summarises its relationship to previous work. Section 1.4 examines the issues that arise in reconstruction. It falls into four main subsections. Section 1.4.1 sketches our approach to reconstruction. Section 1.4.2 is a brief introduction to sound correspondences. The third, §1.4.3, looks at the kinds of language grouping found in Oceania, as this bears on the validity of our reconstructions. Section 1.4.4 sets out the criteria that we apply in making a reconstruction, and our answers to the challenges this raises. In section 1.5 we briefly explain the conventions used in the cognate sets that make up much of this and previous volumes. Section 1.7 brings us to Proto Oceanic itself and presents its phonology as it has been understood until now, and the two orthographies that have been used to represent it. After a short note on POc morphology in §1.6, section 1.8 takes us, we think appropriately in this our final volume, to the study of Proto Oceanic phonology and origins based on volumes 1–5 (Ross, in prep.). The results are summarised in §1.9.

2. The present volume

Inspection of the table of contents shows that the chapters in the present volume vary hugely in length. Each chapter concerns a semantic domain. For some of these domains—kinship, seasonal cycles, counting—we found a wealth of data and were dealing with internally structured closed classes of lexemes whose presentation required numerous tables (and diagrams in the case of kinship). For other domains—the spirit world, measurement—there was limited lexical material, and for yet others—*mana, *tabu—the author chose to limit domains to single concepts considered by others to be key cultural concepts in the Oceanic lexicon.

Chapters 2 to 5 of this volume are concerned with POc speakers’ social organisation. Chapter 2 is a detailed reconstruction of kinship terms and structures. Chapter 3, by the late Per Hage, is a slightly edited and abridged version of a paper first published in The Journal of the Polynesian Society in 2007. It complements chapter 2 by using evidence from disciplines other than linguistics to answer the question, “Was POc society matrilineal?” Chapter 4 returns to a much discussed issue, reconstructing terms associated with chieftainship and rank in POc and examining the consequences of this reconstruction.⁴ Chapter 5 uses reconstructed terms to investigate POc speakers’ settlement patterns.

Chapter 6 concerns the probable recreational activities of POc speakers, looking at music, song, dance and games.

In chapters 7 to 10 we turn to topics that have to do with belief systems and the supernatural. Chapter 7 concerns the beings that inhabited the POc spirit world. Chapters 8 and 9 both deal with human manipulation of the supernatural. Chapter 8 takes a broad look at magic, while chapter 9 focuses on the reconstruction of PEOc *mana, a term that has been much discussed by Pacific anthropologists and denotes the pervasive supernatural power given by ancestral spirits to certain powerful individuals to ensure their success. Chapter 10 analyses the meanings of the POc term *tabu, which has reflexes throughout Oceanic. It meant ‘prohibited’, but in certain EOc languages it also attributes an aura of sanctity to the ‘prohibited’ person or object.

Chapter 11 investigates in some detail the way that Oceanic speakers have referred to the cyclic nature of time and have used the sun, moon and stars to regulate the agricultural cycle.

The terms that POc speakers used to refer to various aspects of speech are the subject of chapter 12.

Chapter 13 reconstructs terms that had to do with trade and more generally with change of possession: giving, receiving and stealing. It also introduces the practice of ceremonial exchange, which plays a role in chapter 14 on counting. There it is argued that the POc decimal counting system and its associated complexities were kept alive by their use in ceremonial exchange feasts. Chapter 15 suggests that POc may also have had a digit-tally system used in everyday counting. One counting complexity covered in chapter 14 is the use of numeral classifiers, and chapter 16 deals with the subset of classifiers used in measurement.

Appendix A lists the data sources employed in this volume. Appendix B lists the languages from which data for this and previous volumes are drawn. It includes alternative names of languages, an index to languages, maps showing their approximate locations, and a list of their ISO codes, glottocodes and longitudes and latitudes.

3. The relation of the current project to previous work

Reconstructions of POc phonology and lexicon began with Dempwolff’s pioneering work in the 1920s and 1930s. Dempwolff’s dictionary of reconstructions attributed to Proto Austronesian (PAn) (Dempwolff 1938)—but equivalent in modern terms to Proto Malayo-Polynesian (PMP)—includes some 600 reconstructions with reflexes in Oceanic languages.

Since the 1950s, POc and other early Oceanic interstage languages have been the subject of a considerable body of research. However, relatively few new reconstructions safely attributable to POc were added to Dempwolff’s material until the 1970s. In 1969 George Grace made available as a working paper a compilation of reconstructions from various sources amounting to some 700 distinct items, attributed either to POc or to early Oceanic interstages. These materials were presented in a new orthography for POc, based largely on Biggs’ (1965) orthography for an interstage he called Proto Eastern Oceanic. Updated compilations of Oceanic cognate sets were produced at the University of Hawai‘i in the period 1977–1983 as part of a project directed by Grace and Pawley. These compilations and the supporting data are problematic in various respects and we have made only limited use of them.

Comparative lexical studies have been carried out for several lower-order subgroups of Oceanic: for Proto Polynesian by Biggs (resulting in Walsh & Biggs 1966, Biggs, Walsh & Waqa 1970 and subsequent versions of the POLLEX file, including Biggs & Clark 1993, Clark & Biggs 2006 and online as Greenhill & Clark 2011); for Proto Micronesian by scholars associated with the University of Hawai‘i (Bender et al. 1983, 2003a, 2003b); for the ancestor of the Banks and Torres languages by Alexandre François (several unpublished manuscripts); for Proto North and Central Vanuatu by Clark (2009); for Proto Southern Vanuatu by Lynch (2001c); for New Caledonia by Ozanne-Rivierre (1992), Haudricourt & Ozanne-Rivierre (1982) and Geraghty (1989); for Proto SE Solomonic by Levy (1980) and Lichtenberk (1988); for Proto Central Pacific by Hockett (1976), Geraghty (1983, 1986, 1996, together with a number of unpublished papers); for Proto Eastern Oceanic by Biggs (1965), Cashmore (1969), Levy (1980), and Geraghty (1990); and for Proto Central Papuan by Pawley (1975), Lynch (1978a, 1980), and Ross (1994a).

Robert Blust (1970, 1980a, 1983-84a, 1986, 1989) of the University of Hawai‘i published, in a series of papers, extensive, alphabetically ordered lexical reconstructions (with supporting cognate sets) for interstages earlier than POc, especially for Proto Austronesian, Proto Malayo-Polynesian and Proto Eastern Malayo-Polynesian. He also wrote several papers investigating specific semantic fields (Blust 1980c, 1982b, 1987, 1994). Blust & Trussel had a major work in progress, the online Austronesian Comparative Dictionary (ACD), which brought together Blust’s reconstructions for Proto Austronesian and lower-order stages up to mid-2020, when the sudden death of Steve Trussel, who was responsible for the web interface and data input, brought this work to a halt. With the passing of Robert Blust in January 2022, the ACD was bequeathed to Alexander Smith and found a new home with the Cross-Linguistic Linked Data project, where hopefully it will continue to grow.⁵

Several papers predating our project systematically investigated particular semantic domains in the lexicon of POc, e.g. Milke (1958b), French-Wright (1983), Pawley (1982a, 1985), Pawley & Green (1984), Lichtenberk (1986), Walter (1989), and the various papers in Pawley & Ross (1994). Ross (1988) contained a substantial number of new POc lexical reconstructions, as well as proposed modifications to the reconstructed POc sound system and the orthography. However, previous Oceanic lexical studies were limited both by large gaps in the data, with a distinct bias in favour of ‘Eastern Oceanic’ languages, and by the technical problems of collating large quantities of data. Although many languages in Melanesia remain poorly described, there are now many more dictionaries and extended word lists, particularly for Papua New Guinea, than there were in the 1980s. And developments in computing hardware and software now permit much faster and more precise handling of data than was possible then. A list of sources is found in Appendix A.

Several compilations of reconstructions have provided valuable points of reference, both inside and outside the Oceanic group. We are indebted particularly to Bender et al. (2003a, 2003b), two editions of POLLEX (Biggs & Clark 1993 and Clark & Biggs 2006), Blust & Trussel (2020), Clark (2009) and Lynch (2001c).

In the course of planning the several volumes of the present project, we came to realise that the form in which preliminary publications were presented—namely as essays, each discussing cognate sets for a particular semantic field at some length—would also be the best form for the presentation of this set of volumes. A discursive treatment of individual terminologies, as opposed, say, to a dictionary-type listing of reconstructions with supporting cognate sets, makes it easier to relate the linguistic comparisons to relevant issues of culture history, language change, and methodology. Hence each of the present volumes has as its core a collection of analytic essays. Some of these have been published or presented elsewhere, but are included here in revised form.

In some cases we have updated the earlier versions in the light of subsequent research, and, where appropriate, have inserted cross-references between contributions. Authorship is in some cases hard to pin down, as a number of people have had a hand in collating the data, doing the reconstructions, and (re)writing for publication here. In most chapters, however, one person did the research which determined the structure of the terminology, and that person appears as the first or only author, and where another or others had a substantial part in putting together the chapter they appear as the second or further authors.

4. Reconstructing the lexicon

4.1. Terminological reconstruction

Our method of doing ‘terminological reconstruction’ is as follows. First, the terminologies of present-day speakers of Oceanic languages are used as the basis for constructing a hypothesis about the semantic structure of a corresponding POc terminology, taking account of (i) ethnographic evidence, i.e. descriptions of the lifestyles of Oceanic communities and (ii) the geographical and physical resources of particular regions of Oceania. For example, by comparing terms in several languages for parts of an outrigger canoe, or for growth stages of a coconut, one can see which concepts recur and so are likely to have been present in POc. Secondly, a search is made for cognate sets (§1.4.2), i.e. words from different languages that appear to be descended from the same protoform, from which forms can be reconstructed to match each meaning in this hypothesised terminology. The search is not restricted to members of the Oceanic subgroup; if a term found in an Oceanic language proves to have external (non-Oceanic) cognates, the POc antiquity of that term will be confirmed and additional evidence concerning its meaning will be provided. Thirdly, the hypothesised terminology is re-examined to see if it needs modification in the light of the reconstructions. There are cases, highlighted in the various contributions to these volumes, where we were able to reconstruct a term where we did not expect to do so and conversely, often more significantly, where we were unable to reconstruct a term where we had believed we should be able to. In each case, we have discussed the reasons why our expectations were not met and what this may mean for Oceanic culture history. We have set out to pay more careful attention to reconstructing the semantics of POc forms than has generally been done in earlier work, treating words not as isolates but as parts of terminologies.

Blust (1987:81) distinguishes between conventional ‘semantic reconstruction’, which asks, “What was the probable meaning of protomorpheme X?”, and Dyen and Aberle’s (1974) ‘lexical reconstruction’, where one asks, “What was the protomorpheme which probably meant ‘X’?” At first sight, it might appear that terminological reconstruction is a version of lexical reconstruction. However, there are sharp differences. Lexical reconstruction applies a formal procedure: likely protomeanings are selected from among the glosses of words in available cognate sets, then an algorithm is applied to determine which meaning should be attributed to each set. This procedure may have unsatisfactory results, as Blust points out. Reconstructions may end up with crude and overly simple glosses; or no meaning may be reconstructed for a form because none of the glosses of its reflexes is its protomeaning.

Terminological reconstruction is instead similar to the semantic reconstruction approach. In terminological reconstruction the meanings of protomorphemes are not determined in advance. Instead, cognate sets are collected and their meanings are compared with regard to:

their specific denotations, where these are known;
the geographic and genealogical distribution of these denotations (i.e. are the glosses from which the protogloss is reconstructed well distributed?);
any derivational relationships to other reconstructions;
their place within a working hypothesis of the relevant POc terminology (e.g. are terms complementary, as when ‘bow’ implies ‘arrow’ and ‘seine net’ implies ‘floats’ and ‘weights’? Are there different levels of classification: generic, specific, and so on?).

For example, it proved possible to reconstruct the following POc terms for tying with cords (vol.1:290–293):

POc *buku ‘tie (a knot); fasten’
POc *pʷita ‘tie by encircling’
POc *paqu(s), *paqus-i- ‘bind, lash; construct (canoe +) by lashing together’
POc *pisi ‘bind up, tie up, wind round, wrap’
POc *kiti ‘tie, bind’

In each of the supporting cognate sets from contemporary languages there are a number of items whose glosses in the dictionaries or word lists are too vague to tell the analyst anything about the specific denotation of the item, and in the case of *kiti this prevents the assignment of a more specific meaning. The verb *buku can be identified as the generic term for tying a knot because of its derivational relationship (by zero derivation) with a noun whose denotation is clearly generic, *buku ‘node (as in bamboo or sugarcane); joint; knuckle; knot in wood, string or rope’ (vol.1:85–86). Other senses are extensions of this meaning (vol.2:50, vol.5:159, 175–176, 341). Reconstruction of the meaning of *pʷita as ‘tie by encircling’ is supported by the meanings of the Lukep, Takia and Longgu reflexes, respectively ‘tie by encircling’, ‘tie on (as grass-skirt)’, and ‘trap an animal’s leg; tie s.t. around ankle or wrist’: Lukep and Takia are North New Guinea languages, whilst Longgu is SE Solomonic. Reconstruction of the meaning of *paqu(s), *paqus-i- as ‘bind, lash; construct (canoe +) by lashing together’ is supported by the meanings of the Takia, Kiribati and Samoan reflexes, respectively ‘tie, bind; construct (a canoe)’, ‘construct (canoe, house)’, and ‘make, construct (wooden objects, canoes +)’: Takia is a North New Guinea language, Kiribati is Micronesian, and Samoan is Polynesian. The meaning of *pisi is similarly reconstructed by reference to the meanings of its Mono-Alu, Mota, Port Sandwich, Nguna and Fijian reflexes.
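The distribution check applied to *pʷita and *paqu(s) can be sketched in a few lines of Python. This is an illustration only, not the procedure used in these volumes: the reflex glosses and groupings are those cited in the paragraph above, while the function names and the threshold of two groupings are hypothetical choices made for the example.

```python
# Reflex glosses cited in the paragraph above: (language, grouping, gloss).
# Grouping abbreviations: NNG = North New Guinea, SES = SE Solomonic,
# Mic = Micronesian, Pn = Polynesian.
REFLEX_GLOSSES = {
    "*pʷita": [
        ("Lukep", "NNG", "tie by encircling"),
        ("Takia", "NNG", "tie on (as grass-skirt)"),
        ("Longgu", "SES", "trap an animal's leg; tie s.t. around ankle or wrist"),
    ],
    "*paqu(s)": [
        ("Takia", "NNG", "tie, bind; construct (a canoe)"),
        ("Kiribati", "Mic", "construct (canoe, house)"),
        ("Samoan", "Pn", "make, construct (wooden objects, canoes +)"),
    ],
}

# Hypothetical threshold: the number of distinct groupings required before
# a shared meaning is treated as well distributed. Assumed for this sketch.
MIN_GROUPINGS = 2


def groupings_attesting(reflexes):
    """Return the set of groupings in which supporting reflexes are found."""
    return {grouping for _language, grouping, _gloss in reflexes}


def well_distributed(reflexes, minimum=MIN_GROUPINGS):
    """True if the supporting reflexes come from enough distinct groupings."""
    return len(groupings_attesting(reflexes)) >= minimum


for etymon, reflexes in REFLEX_GLOSSES.items():
    groups = sorted(groupings_attesting(reflexes))
    print(etymon, groups, well_distributed(reflexes))
```

Counting groupings rather than languages keeps two closely related witnesses, such as Lukep and Takia, from being counted as independent evidence.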

Often, however, the authors have been less fortunate in the information available to them. For example, Osmond (vol.1:222–225) reconstructs six POc terms broadly glossed as ‘spear’. Multiple terms for implements within one language imply that these items were used extensively and possibly in specialised ways. Can we throw light on these specialised ways? Unfortunately, some of the word lists and dictionaries available give minimal glosses, e.g. ‘spear’, for reflexes of the six reconstructions. What we need to know for each reflex is: what is the level of reference? Is it a term for all spears, or perhaps all pointed projectiles including arrows and darts? Or does it refer to a particular kind of spear? Is it noun or verb or both? If a noun, does it refer to both the instrument and the activity? Most word lists are frustratingly short on detail. For this kind of detail, ethnographies have proven a more fruitful source of information than many word lists.

Another problem is inherent in the dangers of sampling from some 500 languages. The greater the number of languages, the greater are the possible variations in meaning of any given term, and the greater the chances of two languages making the same semantic leaps quite independently. Does our (sometimes quite limited) cognate set provide us with a clear unambiguous gloss, or have we picked up an accidental bias, a secondary or distantly related meaning? Did etymon x refer to fishhook or the material from which the fishhook was made? Did etymon y refer to the slingshot or to the action of spinning round?

4.2. Sound correspondences

Phonological changes, whereby one sound evolves into another, are mostly regular. For example, within each language below, the reflexes of the three POc words (and of numerous others) begin with the same consonant.⁶ In each language all instances of initial *p- have evolved “regularly”, i.e. in the same way.

| | POc | *papine | *pisiko | *pat[i] ⁷ | *p- |
|---|---|---|---|---|---|
| | | ‘woman’ | ‘meat, flesh’ | ‘four’ | |
| Adm: | Aua | pifine | pirio | — | p- |
| Adm: | Baluan | pein | pusio | pa- | p- |
| NNG: | Numbami | — | wiso | wata | w- |
| PT: | Kilivila | vivila | viliy-na | -vasi | v- |
| PT: | Yamalele | vavine | viɣo | — | v- |
| PT: | Sinaugoro | vavine | vi-viɣo | vasi-vasi | v- |
| PT: | Motu | hahine | hidio | hani | h- |
| MM: | Tolai | vavina | vio | -vat | v- |
| MM: | Vaghua | vavene | vəzəɣo | -vac | v- |
| SES: | Arosi | haihine | hasiʔo | hai | h- |
| NCV: | Mota | vavine | visoɣo-i | vat | v- |
| Mic: | Woleaian | faifile | fitixo | faa- | f- |
| Fij: | Wayan | vavine | viðiko | vā | v- |
| Pn: | Samoan | fefine | — | fā | f- |

The grouping to which each language belongs is indicated by an abbreviation on the left (§1.5.1).

The “sound correspondence” that concerns us here, the initial consonant correspondence, is shown on the right. Reconstructing forms in a protolanguage depends on working out the systematic sound correspondences among cognate vocabulary in contemporary languages and on having a working hypothesis about how the sounds of POc have changed and are reflected in modern Oceanic languages. Working out sound correspondences even for twenty languages is a large task, and so we have relied heavily on the work of others and our own previous work. The sound correspondences we have used are as follows: Ross (1988) for Western Oceanic and Admiralties; Ross (1996a) for Yapese; Ross (1996b) for Oceanic languages of Indonesian Papua; Pawley (1972) for Eastern Oceanic; Levy (1979, 1980) for SE Solomonic and Lichtenberk (1988) for Cristobal-Malaitan; Pawley (1972) and Tryon & Hackman (1983) for SE Solomonic; Ross & Næss (2007) for Temotu; François (pers. comm.) for the Banks and Torres Islands of Vanuatu; Tryon (1976) and Clark (2009) for North and Central Vanuatu; Lynch (2001c) for Southern Vanuatu; Geraghty (1989), Haudricourt & Ozanne-Rivierre (1982), Ozanne-Rivierre (1992, 1995) and Lynch (2015) for New Caledonia; Jackson (1986) and Bender et al. (2003a, 2003b) for Micronesian; Geraghty (1986) for Central Pacific; and Biggs (1978) for Polynesian. We have also done additional work on North and Central Vanuatu and New Caledonia ourselves.
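The initial-consonant correspondence in the table of reflexes of *papine, *pisiko and *pat[i] can be derived mechanically, which gives a sense of what checking regularity involves. The Python sketch below is illustrative only: the forms are copied from that table, the function names are hypothetical, and treating the first character of a form as its initial segment is a simplifying assumption that happens to hold for these data.

```python
# Reflexes of POc *papine 'woman', *pisiko 'meat, flesh' and *pat[i] 'four',
# copied from the table of reflexes. None marks a gap in the table.
REFLEXES = {
    "Aua": ["pifine", "pirio", None],
    "Baluan": ["pein", "pusio", "pa-"],
    "Numbami": [None, "wiso", "wata"],
    "Kilivila": ["vivila", "viliy-na", "-vasi"],
    "Yamalele": ["vavine", "viɣo", None],
    "Sinaugoro": ["vavine", "vi-viɣo", "vasi-vasi"],
    "Motu": ["hahine", "hidio", "hani"],
    "Tolai": ["vavina", "vio", "-vat"],
    "Vaghua": ["vavene", "vəzəɣo", "-vac"],
    "Arosi": ["haihine", "hasiʔo", "hai"],
    "Mota": ["vavine", "visoɣo-i", "vat"],
    "Woleaian": ["faifile", "fitixo", "faa-"],
    "Wayan": ["vavine", "viðiko", "vā"],
    "Samoan": ["fefine", None, "fā"],
}


def initial_segment(form):
    """First character of a form, ignoring a leading affix-boundary hyphen.

    Assumption: one character equals one segment, true of these forms only.
    """
    return form.lstrip("-")[0]


def initial_correspondence(reflexes):
    """Map each language to the set of initial segments in its reflexes."""
    return {
        language: {initial_segment(f) for f in forms if f is not None}
        for language, forms in reflexes.items()
    }


correspondence = initial_correspondence(REFLEXES)

# A language is regular for *p- here if all its reflexes share one initial.
irregular = {lang: segs for lang, segs in correspondence.items() if len(segs) != 1}

for language, segments in correspondence.items():
    print(f"{language}: *p- > {'/'.join(sorted(segments))}-")
```

For these three cognate sets `irregular` is empty: every language shows a single reflex of initial *p-, matching the rightmost column of the table.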

For non-Oceanic languages we have referred to sound correspondences given by Tsuchida (1976) for Formosan languages; by Zorc (1977, 1986) and Reid (1982) for the Philippines; by Adelaar (1992b) and Nothofer (1975) for Malay and Javanese; by Sneddon (1984) for Sulawesi; by Collins (1983) for central Maluku; by Grimes & Edwards (in prep.) for what is conventionally known as CMP; and by Blust (1978a) and Kamholz (2014) for SHWNG.

Regular sound correspondences can be interfered with in various ways: by phonetic conditioning that the analyst has not identified (see, e.g., Blust 1996a), by borrowing (for an extreme Oceanic case, see Grace 1996), or by the frequency of an item’s use (Bybee 1994). We have tried at least to note, and sometimes to account for, irregularities in cognate sets.

4.3. The internal structure of the Oceanic subgroup of the Austronesian family

Figure 1.1 shows nine primary subgroups of Oceanic. Its rake-like structure indicates that no convincing body of shared innovations has been found to allow any of the nine subgroups to be combined into higher-order groupings. Section 1.4.3.1 explains the theory that underlies the formulation of Figure 1.1, which is important to the practice of reconstruction. Sections 1.4.3.2 and 1.4.3.3 offer some commentary on our subgrouping, and in §1.4.4 we explain how our criteria for making a reconstruction and attributing it to a protolanguage are related to subgrouping issues.

4.3.1. Subgroups and linkages

In Figure 1.1 each node is, with one minor exception, either a single language, usually a reconstructed protolanguage, or, in italics, a group of languages. The exception is the two very closely related languages Mussau and Tench.

**Figure 1.1:** Schematic diagram showing the subgrouping of Oceanic Austronesian languages.

Where a node is a protolanguage, its descendants form a subgroup. The only descendant languages shown in Figure 1.1 are reconstructed protolanguages, but Appendix B lists by grouping the descendant languages referred to in these volumes. A subgroup is identified by innovations shared by its member languages, i.e. it is ‘innovation-defined’ in the terminology of Pawley & Ross (1995). These innovations are assumed to have occurred just once, in the subgroup’s protolanguage, i.e. the exclusively shared ancestor of its members. Thus languages of the large Oceanic subgroup of Austronesian share a set of innovations relative to the earlier Austronesian stages shown in Figure 1.5. By inference these innovations occurred in their common ancestor, POc, and the claim that they are innovations is based on a comparison of reconstructed POc with reconstructed PMP. The phonological innovations of POc were identified by Dempwolff (1934), and have been somewhat modified by subsequent research (§1.8.1). POc also reflects morphosyntactic innovations (Lynch et al. 2002: ch.4), morphological innovations (e.g. POc acquired a morphological distinction between three kinds of alienable possessive relationship: food, drink and general; Lichtenberk 1985a), and lexical innovations (e.g. PMP *limaw ‘citrus fruit’ was replaced by POc *molis; Lynch 1984).

Italics are used in Figure 1.1 to indicate a group of languages that is not a subgroup, i.e. has no identifiable exclusively shared parent. Thus Southern Oceanic linkage in Figure 1.1 indicates a collection of languages descended from POc (Ross 1988). They comprise the languages of Vanuatu, the Loyalties and New Caledonia, but they do not form a subgroup. There was no “Proto Southern Oceanic”, as no convincing innovation has been identified that is reflected by all Southern Oceanic languages. Nonetheless, there are innovations which chain various, sometimes overlapping, groups of Southern Oceanic languages together (§1.4.3.2). Some of these innovations are inherited, i.e. they define smaller subgroups within Southern Oceanic. Of these, Southern Vanuatu is the best known example (Lynch 2001c:181–184). Others are probably the result of contact between fairly similar languages. The recently discovered fact that there were multiple immigrations by, we take it, speakers of early Oceanic languages probably gave rise to this kind of contact (see the discussion in §15.8.1).

The term “linkage” occurs in several of the italicised labels in Figure 1.1. The distinction between a subgroup and a linkage is important in reconstruction.⁸

A subgroup is defined by a set of coterminous innovations that are inferred to have occurred in its common ancestor (its protolanguage).⁹ By “coterminous” is meant that all the innovations are shared by all the languages of the group.¹⁰ This is the situation in Figure 1.2.¹¹ Languages A and B share a set of innovations and form one subgroup. Languages C–J share another set of innovations and form another subgroup. The processes of language change that give rise to innovations are continuous, meaning that subgroup formation is recursive. Within the subgroup CDEFGHJ are two (sub)subgroups CDE and FG, alongside two languages H and J. This situation can be represented in two ways: by a tree (left) or a maplike representation (right). The tree, like Figure 1.1, also displays the protolanguages from which the languages of each subgroup are inferred to be descended.

**Figure 1.2:** Schematic diagram of a subgroup

Figure 1.4 shows the same subgroup AB as Figure 1.2, but languages C–J display a pattern of intersecting subgroups.¹² Languages CDEF form a “subgroup” on the basis of a set of coterminous innovations, and languages CDE form one on the basis of a further set. But E and F also share innovations with G, H and J, forming a subgroup EFGHJ that intersects with CDEF. What is more, E and F share further innovations with H and G respectively; that is, E and F each reflect innovations that are coterminous neither with those that define CDEF, nor with those that define EFGHJ. This intertwining of groups formed by intersecting innovation domains is a linkage (an ‘innovation-linked group’ in Pawley & Ross 1995). Its boundary can be defined, but no tree that accounts for all innovations can be drawn. If no tree can be drawn, then no protolanguage can be posited, and, since a reconstruction must belong to a protolanguage, strictly speaking no reconstructions can be made. We return to this matter in §1.4.3.2.

**Figure 1.4:** Schematic diagram of a linkage
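The contrast between the two patterns can be stated as a simple test: a set of innovation domains fits a single tree only if every pair of domains is either nested or disjoint. The Python sketch below illustrates this under stated assumptions. The domains listed are a reading of the prose descriptions of Figures 1.2 and 1.4, not data taken from the figures themselves, and the function names are hypothetical.

```python
from itertools import combinations


def tree_compatible(a, b):
    """Two innovation domains fit one tree if nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def conflicting_domains(domains):
    """Return every pair of domains that intersect without nesting.

    Each domain is a string of single-letter language labels, e.g. "CDE".
    An empty result means the domains can all be drawn on one tree.
    """
    sets = [frozenset(d) for d in domains]
    return [
        ("".join(sorted(a)), "".join(sorted(b)))
        for a, b in combinations(sets, 2)
        if not tree_compatible(a, b)
    ]


# Domains as described for the subgroup pattern (Figure 1.2).
FIGURE_1_2 = ["AB", "CDEFGHJ", "CDE", "FG"]

# Domains as described for the linkage pattern (Figure 1.4):
# E shares further innovations with H, and F with G.
FIGURE_1_4 = ["AB", "CDEF", "CDE", "EFGHJ", "EH", "FG"]

subgroup_conflicts = conflicting_domains(FIGURE_1_2)
linkage_conflicts = conflicting_domains(FIGURE_1_4)

print("Figure 1.2 conflicts:", subgroup_conflicts)
print("Figure 1.4 conflicts:", linkage_conflicts)
```

The first list is empty, so the domains of Figure 1.2 can be drawn as a tree. The second list is not, and each pair in it, such as CDEF and EFGHJ, marks a point where no single tree can account for both innovation domains.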

Innovations begin as changes that occur in the language of an individual speaker, and some of these changes spread across the community. As long as languages are mutually intelligible, changes continue to spread. Their places of origin, and directions and extents of spread, may differ, so that the resulting innovations are not coterminous but instead intersect. And over time, social relationships may change, so that changes arrive from new origins. The outcome of these processes is a linkage.

However, untangling the history of a linkage is difficult, and sometimes impossible. In the “worst-case” scenario one or more innovations spread right across the languages of the linkage. In this case it becomes virtually impossible to distinguish it from a subgroup. But returning to Figure 1.4, perhaps EFGHJ in fact reflects innovations that occurred in Proto EFGHJ. If so, then we cannot posit Proto CDEF or Proto CDE. Instead, we infer that at some date relationships were realigned so that speakers of pre-C and pre-D came into intimate enough contact with speakers of Proto EFGHJ or one of its descendants for innovations to pass between them, creating the illusion of a subgroup CDEF. But, with a little thought one could come up with a good number of scenarios that result in the pattern in Figure 1.4, and determining which reflects the actual history can be difficult.

**Map 1.3:** Oceanic language groups in northwest Melanesia: the Admiralties and St Matthias groups and the subgroups of Western Oceanic.

It is tempting to see a subgroup and a linkage as opposing patterns, but comparison of Figure 1.4 with the righthand diagram of Figure 1.2 shows that a subgroup is a subtype of a linkage, one in which the ranges of innovations happen not to intersect (François 2014:171). Nonetheless, we maintain the distinction between a subgroup and a linkage, as the former reflects a reconstructable protolanguage but the latter does not (or sometimes, as emerges below, does so more weakly).

4.3.2. Oceanic linkages

A number of Oceanic linkages have been recognised by scholars. They include Fijian (Geraghty 1983), the Caroline Islands (Jackson 1983), Vanuatu (Tryon 1976; Clark 1985; Lynch 2000b; 2004d; François 2011b, 2014), NW Melanesia (Ross 1988), the SE Solomons (Lichtenberk 1988, 1994b; Pawley 2011) and E Polynesian (Walworth 2014). In some of these there is evidence for events that would further complicate the description of a linkage in §1.4.3.1.

One such event sequence is indicated in Figure 1.1 by a dashed line around the relevant groups of languages. These are instances of a group of languages undergoing a division and then coming back into contact to form a grouping in a different constellation from before. The best researched of these is the Fijian linkage, which represents the partial resynthesis of the Fiji-based descendants of earlier Western Central Pacific and Eastern Central Pacific linkages after Rotuman and Polynesian had split off from them (Geraghty & Pawley 1981; Geraghty 1983; Pawley 1996c).¹³ Geraghty reconstructed the history of the Fijian linkage by painstaking analysis of innovations from at least two stages in its history. From the earlier period Western Fijian languages share innovations with Rotuman and Eastern Fijian with Polynesian. From a more recent period Western Fijian and Eastern Fijian languages share innovations with each other, reflecting their reintegration into a single linkage, within which the present Western/Eastern boundary has shifted relative to the (fuzzy) boundary of the earlier period. This kind of process also forms part of the history of the Guadalcanal-Gelic subgroup within SE Solomonic (Pawley & Green 1984).

A linkage sometimes consists of some but not all of the languages descended from a single parent. The Western Oceanic linkage reflects the innovations of POc, but no innovation is exclusive to the whole of Western Oceanic (although the merger of POc *r and *R comes close). However, the languages of its three component linkages, North New Guinea, Papuan Tip and Meso-Melanesian (Map 1.1), display complex patterns of intersecting innovations.¹⁴ The WOc linkage is evidently descended from the dialects of POc that were left behind in the Bismarck Archipelago after speakers of the languages ancestral to the other eight primary subgroups in Figure 1.1 had moved away to the north or east (Ross 2014, 2017). After these departures various innovations occurred. Each arose somewhere in the Western Oceanic dialect network and spread to neighbouring dialects without reaching every dialect in the network.

The Southern Oceanic linkage as proposed by Lynch (1999, 2000b, 2001b, 2004d) is characterised by complex overlapping innovations, but by none that are reflected in all its member languages and would qualify it as a subgroup (see discussion in Lynch et al. 2002:112–114).

4.3.3. Oceanic subgroups

Figure 1.1 also shows a number of Oceanic groups for which a protolanguage is reconstructable. By definition these are subgroups. They are Admiralty (Ross 1988: ch.9), SE Solomonic (Pawley 1972:98–110; Levy 1979, 1980, n.d.; Tryon & Hackman 1983; Lichtenberk 1988), Temotu (Ross & Næss 2007; Næss & Boerger 2008; Lackey & Boerger 2021), S Vanuatu (Lynch 2001c:181–184), Micronesian (Jackson 1983, 1986; Bender et al. 2003a), and Papuan Tip (Ross 1992b).

Central Pacific is also a subgroup, but one defined by only a handful of shared innovations, indicating that the period of unity was short (Geraghty 1996). The high-order subgrouping of Central Pacific is due to Geraghty (1983), except for the position of Rotuman (Pawley 1996b). Within Central Pacific is another long recognised subgroup, Polynesian, for which Pawley (1996a) lists diagnostic innovations.

4.4. Criteria for reconstruction

4.4.1. The distributional criterion

The strength of a lexical reconstruction rests crucially on the distribution of the supporting cognate set across language groups. The distribution of cognate forms and agreements in their meanings is much more important than the number of cognates. It is enough to make a secure reconstruction if a cognate set occurs in just two languages in a family, with agreement in meaning, with two provisos. The first is that the two languages belong to different primary groups, and the second that there is no reason to suspect that the resemblances are due to borrowing or chance. The PMP term *(h)abij ‘twins’ is reflected in several western Malayo-Polynesian languages (e.g. Batak apid ‘twins, double (fused) banana’) but, when the reconstruction was made, only one Oceanic reflex was known,¹⁵ namely Roviana avisi ‘twins of the same sex’ (vol. 5, §2.6). Because Roviana belongs to a different first-order branch of Malayo-Polynesian from the western Malayo-Polynesian witnesses (cf Figure 1.5) and because there is virtually no chance that the agreement is due to borrowing or chance similarity, this distribution was enough to justify the reconstruction of PMP *(h)abij, POc *apic ‘twins’.

4.4.2. Which protolanguage? Handling the Oceanic tree’s rakelike structure

Here we deal with two issues relating to the question, To which protolanguage should a reconstruction be assigned? In this section we explain how we handle the rake-like structure of the Oceanic tree in Figure 1.1. In §1.4.4.3 we respond to the fact that a linkage has no identifiable protolanguage (§1.4.3.2).

The rake-like form of Figure 1.1 almost certainly reflects the very rapid settlement of Oceania out of the Bismarcks,¹⁶ but it confronts us with a methodological question. If we follow the standard rubric that we make a reconstruction if a cognate set occurs in languages of just two primary language groups (§1.4.4.1), then reflexes of an etymon in, say, a SE Solomonic language and a Micronesian language would be sufficient evidence for a POc reconstruction and the absence of reflexes in Admiralty and Western Oceanic would be irrelevant. Given what we know about the location of the POc homeland (in the Bismarcks; vol.2, ch.2) and the early eastward spread of Oceanic speakers, this is too loose a criterion. Instead, we assume two hypothetical nodes not shown in the tree in Figure 1.1.¹⁷ These are

Remote Oceanic, comprising Southern Oceanic, Micronesian and Central Pacific;
Eastern Oceanic, comprising SE Solomonic and Remote Oceanic.¹⁸

If a cognate set occurs in two or all three of the groups in Remote Oceanic, the reconstruction is attributed to Proto Remote Oceanic (PROc). If a cognate set occurs in one or more of the groups in Remote Oceanic and in SE Solomonic, it is attributed to Proto Eastern Oceanic (PEOc). In this way we acknowledge that such reconstructions may represent an innovation that postdates the spread of the early Oceanic speech community. There are enough PROc and PEOc reconstructions to suggest that such lexical innovations indeed occurred. This in turn provides evidence for Remote Oceanic and Eastern Oceanic subgroups, but evidence that is too weak to be relied on, for at least two reasons. First, it is quite possible that some of our PROc and PEOc reconstructions will be promoted to POc as more Admiralty and Western Oceanic data become available. Second, it is reasonable to assume that some of our PROc and PEOc etyma are of POc antiquity but happen to have been lost in Proto Admiralty and Proto Western Oceanic. Without supporting phonological or morphological evidence we are unwilling to treat PROc or PEOc as anything other than convenient hypothetical groups which allow us to retain conservative criteria for a POc reconstruction.

A reconstruction that would here be labelled ‘PROc’ was labelled ‘PEOc’ in volumes 1 and 2; from volume 3 onwards, a reconstruction that lacks SE Solomonic reflexes is labelled ‘PROc’. Two factors have led to the distinction between PEOc and PROc in more recent volumes. One is that the historical separateness of SE Solomonic both from Western Oceanic and from groups treated as Remote Oceanic has become increasingly clear through recent research (Pawley 2009). The other, especially relevant to volume 3 on plants and to volume 4 on animals, is that the primary biogeographic divide in Oceania is between Near and Remote Oceania (see vol. 2, Map 5), i.e. between the main Solomons archipelago and the Temotu islands. Whether or not a plant or animal name has a SE Solomonic reflex is thus significant. Many plant names do not, and are thus attributed in volume 3 to PROc.

Our criterion for attributing a reconstruction to POc is that the cognate set must include data from at least two out of three criterial groupings: Admiralties (or Yapese or Mussau), Western Oceanic, and our hypothetical Eastern Oceanic. Both here and at the hypothetical interstages defined above, no reconstruction is made if there are grounds to infer borrowing from one of these groupings to another.¹⁹ We also reconstruct an etymon to POc if it is reflected in just one of these criterial groupings and in a non-Oceanic Austronesian language (a member of one of the lefthand branches in Figure 1.5), as illustrated above by the reconstruction of POc *apic ‘twins’.

There are indications that Yapese (a single-language “subgroup”) and Mussau and Tench (a subgroup with two closely related languages) may be more closely related to Admiralty than to any other Oceanic subgroup,²⁰ and for this reason they are tentatively treated as Admiralty languages for the purposes of reconstruction. That is, the presence of a reflex in one or more of these languages and in Admiralty does not support a POc reconstruction, but the presence of a reflex in one or more of these languages and one of Western Oceanic or Eastern Oceanic does support one.

In chapter 2 (§4) of volume 2 Pawley discusses Blust’s (1998a) proposal that the primary split in Oceanic divides Admiralty from a subgroup embracing all other Oceanic languages. Pawley dubs the latter ‘Nuclear Oceanic’. If Blust’s subgrouping were accepted, then an etymon which lacked cognates outside Oceanic would need to be reflected both in an Admiralties language and in a non-Admiralties language for a POc reconstruction to be made. Etyma with reflexes in both Western and Eastern Oceanic, but not in the Admiralties, would be reconstructed as Proto Nuclear Oceanic. Under the criteria outlined above, however, we attribute these reconstructions to POc. These criteria were used in volumes 1 and 2, and we have thought it wise to maintain them throughout the volumes of this work. The reader who wishes to single out reconstructions attributable to a putative Proto Nuclear Oceanic (rather than to POc) can easily recognise them. They are those POc reconstructions for which (i) there are no Admiralties reflexes, and (ii) there is no higher-order reconstruction (i.e. PEMP, PCEMP, PMP or PAn).

4.4.3. Which protolanguage? Handling linkages

The languages of a linkage have no identifiable exclusively shared parent. Yet we have found many instances in which a cognate set is limited to one of the linkages in Figure 1.1: Western Oceanic, New Guinea Oceanic, Southern Oceanic or the reintegrated North and Central Vanuatu linkage. By the logic of §1.4.3.2 a form reconstructed from a cognate set restricted to a linkage should be reconstructed to the next protolanguage node up the tree. For a Western Oceanic cognate set, for example, this would mean reconstructing it to POc, which would defy the condition that a POc cognate set must be spread over at least two of the criterial groupings (§1.4.4.2).

As with PEOc and PROc (§1.4.4.2), we think it is more realistic to attribute these reconstructions to a hypothetical protolanguage rather than to a higher node in the tree. Hence there are reconstructions labelled PWOc and so on. Again these apparent lexical innovations offer only weak evidence for the protolanguage to which they are attributed. In addition to the explanations of the kinds offered above for PEOc and PROc etyma, it is possible, for example, that an innovatory ‘PWOc’ etymon arose when the Western Oceanic dialect network was still close-knit, and spread from dialect to dialect before the network broke into the two networks ancestral to its present-day first-order subgroups.

It is probable that the NNG and PT linkages form a grouping within WOc, separate from MM. We call this grouping the New Guinea Oceanic linkage, and so etyma reflected only in NNG and PT languages are attributed to a weakly supported Proto New Guinea Oceanic (Milke 1958, Pawley 1978), and etyma reflected in either NNG or PT (or both) and in MM are labelled PWOc.
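The attribution rules of §1.4.4.2 and §1.4.4.3 can be summarised as a decision procedure. The sketch below is a hedged illustration only: the function name, the grouping of abbreviations, and the order in which the tests are applied are assumptions made for the example. It does not model the proviso about borrowing, it leaves TM and SJ unassigned because the text above does not place them in a criterial grouping, and it reads "reflected only in NNG and PT languages" as requiring both.

```python
# Group labels follow the abbreviations used in cognate sets.
ADMIRALTY = {"Adm", "Yap"}          # Yapese and Mussau/Tench treated as Admiralty
WESTERN = {"NNG", "PT", "MM"}       # Western Oceanic linkages
REMOTE = {                          # member groups of hypothetical Remote Oceanic
    "NCV": "Southern Oceanic", "SV": "Southern Oceanic",
    "NCal": "Southern Oceanic", "Mic": "Micronesian",
    "Fij": "Central Pacific", "Pn": "Central Pacific",
}


def attribute(labels, non_oceanic_cognate=False):
    """Return a protolanguage label for a set of group labels, or None."""
    labels = set(labels)
    remote_groups = {REMOTE[lab] for lab in labels if lab in REMOTE}
    has_adm = bool(labels & ADMIRALTY)
    has_woc = bool(labels & WESTERN)
    has_eoc = "SES" in labels or bool(remote_groups)
    criterial = has_adm + has_woc + has_eoc

    # POc: two criterial groupings, or one plus a non-Oceanic cognate.
    if criterial >= 2 or (criterial == 1 and non_oceanic_cognate):
        return "POc"
    # PEOc: SE Solomonic plus at least one Remote Oceanic group.
    if "SES" in labels and remote_groups:
        return "PEOc"
    # PROc: at least two of the three Remote Oceanic groups.
    if len(remote_groups) >= 2:
        return "PROc"
    # PWOc: Meso-Melanesian plus NNG or PT.
    if "MM" in labels and labels & {"NNG", "PT"}:
        return "PWOc"
    # PNGOc: NNG and PT only.
    if {"NNG", "PT"} <= labels:
        return "PNGOc"
    return None


print(attribute({"SES", "Mic"}))
print(attribute({"NCV", "Pn"}))
print(attribute({"MM"}, non_oceanic_cognate=True))
```

A return value of `None` simply means that none of the interstages discussed here applies; the set might still support a lower-order reconstruction.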

5. Conventions common to the series

5.1. Presentation of reconstructions

Each of the contributions to these volumes concerns a particular POc ‘terminology’. Generally, each contribution begins with an introduction to the issues raised by the reconstruction of its particular terminology, and the rest consists of reconstructed etyma with supporting data and a commentary on matters of meaning and form.

The reconstruction of POc *pale below, abbreviated from Chapter 5, shows how reconstructions and supporting cognate sets are presented. Above it is a superordinate (PMP) reconstruction drawn from published sources. Below it are supporting reflexes. Sometimes a lower-order reconstruction like PMic *fale below is included, either in acknowledgment of others’ work, or because it reflects a significant change in form or meaning.

PMP		*balay	‘public building’ (Blust 1987); ‘unwalled building’ (Waterson 1993)
POc		*pale	‘building for storage or public use, open-sided building, shed’
Adm:	Lou	pal	‘canoe hut’
Adm:	Mussau	ale	‘house’
NNG:	Yabem	ale	‘house’
NNG:	Lukep (Pono)	para	‘yam house’
MM:	Tolai	pal	‘house, room’
MM:	Mono-Alu	hale-hale	‘public building’
SES:	Arosi	hare	‘shed for yams’ (E Arosi); ‘house with side of roof only, made in garden’ (W Arosi)
SES:	Bauro	hare	‘canoe house, men’s house’
SES:	Sa’a	hale	‘yam shed outside a garden’
SES:	Kwaio	fale	‘hut for childbirth’
SES:	Gela	hale	‘house’
NCV:	Raga	vale	‘house, hut, garden house’
NCV:	Nokuku	vale	‘shelter’
NCV:	Nokuku	val-val	‘garden shelter’
PMic		*fale	‘meeting house’ (Bender et al. 2003a)
Mic:	Puluwat	fǣl	‘meeting house’
Mic:	Woleai	fal, fale-	‘men’s house, club house’
Fij:	Bauan	vale	‘house’
Pn:	Samoan	fale	‘house’
Pn:	Hawaiian	hale	‘house’

In putting together cognate sets, we have sometimes found apparent or uncertain reflexes which do not quite ‘fit’ the set: either they display a phonological irregularity or their meaning is just a little too different from the rest of the set for us to assume cognacy. Rather than eliminate them, we often include them below the cognate set under the rubric ‘cf. also’.

Because our supporting data are drawn from such a wide range of languages, the convention is adopted of prefixing each language name with the abbreviation for the genealogical or geographic group to which the language belongs, so that the distribution of a cognate set is more immediately obvious. The abbreviations are:

Yap:	Yapese (one language)
Adm:	Admiralty and Mussau/Tench
NNG:	North New Guinea
SJ:	Sarmi/Jayapura
PT:	Papuan Tip
MM:	Meso-Melanesian
SES:	Southeast Solomonic
TM:	Temotu
NCV:	North/Central Vanuatu
SV:	South Vanuatu
NCal:	New Caledonia and Loyalties
Mic:	Micronesian
Fij:	Fijian and Rotuman
Pn:	Polynesian

We have sought to be consistent in always listing these groups in the same order, but contributors vary in the ordering of languages within groups.
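For readers who handle cognate sets electronically, the fixed group order can be applied mechanically. The following sketch assumes each reflex is a tab-separated line of the form shown in the *pale example (group, language, form, gloss); the function names are hypothetical. Because Python's sort is stable, the order of languages within a group is left as the contributor gave it.

```python
GROUP_ORDER = ["Yap", "Adm", "NNG", "SJ", "PT", "MM", "SES",
               "TM", "NCV", "SV", "NCal", "Mic", "Fij", "Pn"]


def parse_reflex(line):
    """Split one tab-separated reflex line into its four fields."""
    group, language, form, gloss = line.split("\t", 3)
    return {"group": group.rstrip(":"), "language": language,
            "form": form, "gloss": gloss}


def sort_reflexes(records):
    """Order records by the standard group order (stable within groups)."""
    rank = {group: i for i, group in enumerate(GROUP_ORDER)}
    return sorted(records, key=lambda rec: rank[rec["group"]])


lines = [
    "Pn:\tSamoan\tfale\t‘house’",
    "Adm:\tLou\tpal\t‘canoe hut’",
    "MM:\tTolai\tpal\t‘house, room’",
    "Adm:\tMussau\tale\t‘house’",
]
ordered = sort_reflexes([parse_reflex(line) for line in lines])
for rec in ordered:
    print(rec["group"], rec["language"], rec["form"])
```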

**Map 1.4:** Groups of Oceanic languages used in cognate sets

Lynch’s research on Southern Oceanic (§1.4.3.2) renders the NCV group mildly anomalous, although there is no doubt that it reflects an integrated dialect network. There are a number of etyma whose reflexes are confined to North and Central Vanuatu, and so we continue to include ‘Proto North/Central Vanuatu’ reconstructions. These perhaps represent Southern Oceanic terms that have been lost in southern Vanuatu and New Caledonia. Where the distribution of reflexes requires it, the chapters in this volume include reconstructions for PROc and for PSOc. Etyma with these distributions were attributed to PEOc in volumes 1 and 2, but the distributions are transparent, thanks to the presence of the group labels in cognate sets (cf §1.4.4.2).

In the interests of space we do not give the history of the reconstructions themselves, as this would often require commentary on the modifications made by others and by us, and on why we have made them. Where a reconstruction is not new, we have tried to give its earliest source, e.g. ‘Blust 1987’ above, but this is difficult when earlier reconstructions differ in form and meaning and when their sources are not reported.

In general, the contributions to these volumes are concerned with items reconstructable in POc, PWOc, PEOc, PROc and occasionally Proto New Guinea Oceanic (PNGOc). Etyma for PWOc, PNGOc and PEOc are reconstructed because these may well also be POc etyma for which known reflexes are not well distributed (see discussion in §1.4.4). Reconstructions for lower-order interstages are decreasingly likely to reflect POc etyma and may be the results of cultural change as Oceanic speakers moved further out into the Pacific.

Contributors to these volumes have usually not made fresh reconstructions at interstages superordinate to POc. What they have done, however, is to cite other scholars’ reconstructions for higher-order interstages, as these represent a summary of the non-Oceanic evidence in support of a given POc reconstruction. These interstages are shown in Figure 1.5.

Sometimes non-Oceanic evidence has been found to support a POc reconstruction where no reconstruction at a higher-level interstage has previously been made. In this case a new higher-order reconstruction is made, and the non-Oceanic evidence is given in a footnote.

Whilst we have tried to use the internal organisation of the lexicons of Oceanic languages themselves as a guide in setting the boundaries of each terminology, we have inevitably taken decisions which differ from those that others might have made. There are, obviously, overlaps and connections between various semantic domains and therefore between the contributions here. We have done our best to provide cross-references, but we have sometimes duplicated information rather than ask the reader repeatedly to look elsewhere in the book. Indexes at the end of each volume and in the final volume are intended to make it easier to use the volumes collectively as a work of reference.

5.2. Data

Data sources are listed in Appendix A.

For some reconstructed etyma only a representative sample of reflexes is given. We have endeavoured to ensure, however, that in each case this sample not only is geographically and genealogically representative, but also provides evidence to justify the reconstruction’s shape and gloss. Where only a few reflexes are known to us, this is usually noted.

Although there are accepted or standard orthographies for a number of the languages from which data are cited here, all data are transcribed as far as possible into a standard phonemic orthography based on that used by Ross (1988:3–4) in order to facilitate comparison.²¹ This means, for example, that the j of the German-based orthographies of Yabem and Gedaged becomes y, Yabem c becomes ʔ, Gedaged z becomes ɬ and so on; the ng of English-based orthographies becomes ŋ; and Fijian g, q and c become ŋ, g and ð respectively.
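The respellings just listed amount to character-for-character substitution, which must be applied simultaneously rather than in sequence (otherwise Fijian q, once rewritten as g, would be rewritten again as ŋ). The sketch below illustrates this with Python's `str.translate`; the table covers only the substitutions named above, the function name is hypothetical, and the English-based digraph ng is left aside because it is not a single character.

```python
# One translation table per source orthography; each table is applied in a
# single pass, so the output of one substitution is never re-substituted.
CONVERSIONS = {
    "Fijian": str.maketrans({"g": "ŋ", "q": "g", "c": "ð"}),
    "Yabem": str.maketrans({"j": "y", "c": "ʔ"}),
    "Gedaged": str.maketrans({"j": "y", "z": "ɬ"}),
}


def to_standard(form, orthography):
    """Convert a form from a source orthography to the standard one."""
    return form.translate(CONVERSIONS[orthography])


# Standard Fijian cagi, used here only to show c and g changing together.
print(to_standard("cagi", "Fijian"))
```

A sequential chain of `str.replace` calls would need careful ordering to give the same result, which is why a single translation table is used here.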

The following symbols have more or less their usual IPA (International Phonetic Association) values: ð, ɢ, ɣ, h, k, l, ʟ, ɬ, ʎ, m, n, ŋ, ñ, p, q, χ, ɾ, r, s, t, w, x, z, ʔ, a, æ, e, ɛ, ə, i, ɨ, o, œ, ɔ, ʌ, u, ɯ. As far as possible, however, our orthography is phonemic and does not show allophonic variation, so that there are instances where a symbol does not have its usual phonetic value. For example, Wayan Fijian k is a voiceless stop word-initially but [k] is in free or stylistic variation with [ɣ] word-medially. The voiced stops b, d, g and the voiced bilabial trill ʙ are prenasalised in some languages, but prenasalisation is not written unless it is phonemically distinctive. Where a language has just one rhotic, we usually write r, despite the fact that that rhotic is sometimes a flap. Other orthographic symbols (with values in IPA) are:

f	[ɸ, f]	voiceless bilabial or (less often) labio-dental fricative
v	[β, v]	voiced bilabial or (less often) labio-dental fricative
c	[ts], [ʧ]	voiceless alveolar or palatal affricate
j	[ʣ], [ʤ]	voiced alveolar or palatal affricate
y	[j]	palatal glide
dr	[ⁿr]	prenasalised voiced alveolar trill (as in Fijian)
ö	[ø]	rounded mid front vowel
ü	[y]	rounded high front vowel

Other superscripts and diacritics are as follows:

contrastive long vowels are represented by a macron, e.g. ā;
contrastive vowel nasalisation is represented by a tilde, e.g. ã;
high and low tone are represented respectively by an acute and a grave accent, e.g. é, è;²²
labialisation is marked by a superscript w, e.g. pʷ;
velarisation is marked by a superscript ɯ, e.g. pᵚ;
contrastive aspiration is marked by a superscript h, e.g. pʰ;
contrastive devoicing is marked by a small circle beneath, e.g. n̥;
apicolabials are represented by the corresponding apical symbol and the linguolabial diacritic (the ‘seagull’), e.g. t̼;
retroflexes are represented by the corresponding apical symbol with a dot beneath, e.g. ṛ.

Except for inflexional morphemes, non-cognate portions of reflexes, i.e. derivational morphemes and non-cognate parts of compounds, are shown in parentheses (…). Where an inflexional morpheme is an affix or clitic and can readily be omitted, its omission is indicated by a hyphen at the beginning or end of the base. This applies particularly to possessor suffixes on directly possessed nouns (see §2.2). Where an inflexional morpheme cannot readily be omitted, it is separated from its base by a hyphen. This may happen because of complicated morphophonemics or because the morpheme is always present, like the attributive -n in some NNG and Admiralties languages and prefixed reflexes of the POc article *na in scattered languages. When a reflex is itself polymorphemic (i.e. the morphemes reflect morphemes present in the reconstructed etymon) or contains a reduplication, the morphemes or reduplicates are also separated by a hyphen.

Languages from which data are cited in this volume are listed in Appendix B in their subgroups or linkages, together with an index allowing the reader to find the subgroup to which a given language belongs. Appendix B also includes alternative language names. The difficulty of deciding where the borderline between dialect and language lies, combined with the fact that these volumes contain work by a number of contributors, has resulted in some inconsistency in the way dialects are labelled in cognate sets. Some occur in the form ‘Lukep (Pono)’, i.e. the Pono dialect of the Lukep language, whilst others are represented simply by the dialect name, e.g. Iduna, noted in Appendix B as ‘Iduna (= dialect of Bwaidoga)’.

5.3. Conventions used in representing reconstructions

Reconstructions are marked with an asterisk, e.g. *Rumaq ‘dwelling house’, in keeping with the standard convention in historical linguistics. POc reconstructions, and also PWOc and PNGOc reconstructions, are given in the orthography of §1.7. For reconstructions at higher-order interstages the orthographies are those used by Blust in his various publications and the ACD. Reconstructions at lower-order interstages are given in the standard orthography adopted for data (§1.5.2). Geraghty’s (1986) PCP orthography, for example, is based on Standard Fijian spelling, and is converted into our standard orthography in the same way as Fijian spelling is. In practice, this means that the orthographies for PEOc, PROc and PCP are the same as for POc, except that a distinction between *p and *v is recognised and *R is generally absent from PCP.²³ Biggs and Clark’s PPn reconstructions are in any case written in an orthography identical to our standard. Bracketing and segmentation conventions in protoforms are shown in Table 1.1.

PMP final consonants are usually retained in POc in absolute word-final position. In many cases decisive evidence for retention or loss can be found in those Oceanic languages that usually retain final consonants. However, there are some cases where it is uncertain whether POc kept a PMP final, as when a PMP etymon is not attested in an Oceanic language that consistently retains POc final consonants. An example is *-d in PMP *palahud ‘go down to the sea or coast’, a term reflected in Oceanic only in languages that regularly lose POc final consonants. In such cases the consonant is reconstructed in parentheses, e.g. POc *palau(r) ‘go to sea, make a sea voyage’.

**Table 1.1.** Bracketing and segmentation conventions in protoforms
(x)	it cannot be determined whether x was present
(x,y)	either x or y was present
[x]	the item is reconstructable in two forms, one with and one without x
[x,y]	the item is reconstructable in two forms, one with x and one with y
x-y	x and y are separate morphemes
x-	x takes an enclitic or a suffix
⟨x⟩	x is an infix
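The bracket conventions in Table 1.1 can be unpacked into the candidate forms they abbreviate. The sketch below is an illustration under stated assumptions: the function name is hypothetical, it handles only the parenthesis and square-bracket conventions, and it deliberately ignores the difference in meaning between them (uncertainty versus two reconstructable forms), since both yield the same list of candidate shapes.

```python
import re

# Matches one innermost (...) or [...] group.
_GROUP = re.compile(r"[(\[]([^()\[\]]*)[)\]]")


def expand(protoform):
    """List the candidate forms abbreviated by bracket notation."""
    match = _GROUP.search(protoform)
    if match is None:
        return [protoform]
    inner = match.group(1)
    # "x,y" means x or y; a bare "x" means with or without x.
    options = inner.split(",") if "," in inner else ["", inner]
    results = []
    for option in options:
        candidate = protoform[:match.start()] + option + protoform[match.end():]
        results.extend(expand(candidate))
    return results


print(expand("*palau(r)"))
print(expand("*waRo(c)"))
```

Forms with more than one bracketed segment are expanded recursively, so the number of candidates doubles with each optional segment.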

In presenting words that display anomalies of form, it is often necessary to posit an expected form. For example, in §14.6.5.1, the Banoni term raus ‘100’ is accompanied by the note “metathesis of †rasu”, i.e. ‘metathesis of expected rasu’. In this volume we use a less widely employed convention and mark expected forms with a dagger, to distinguish an expected form from both reconstructions and real data.²⁴ Sometimes we need to refer to a reconstructed form that one would expect as the regular reflex of an established POc etymon, but which does not occur because an irregular sound change has occurred. In such cases the dagger and asterisk conventions are used together. For example, in vol.5:99, we reconstruct PNCV *kaRo ‘vine, rope; vein’. It is descended, however, from POc *waRo(c) ‘vine, creeper; string, rope; vein, tendon’, and the expected PNCV form, referred to in our discussion there, would be †*waRo. The dagger marks it as expected but unattested.

When historical linguists compile cognate sets they commonly retain word for word the glosses given in the sources from which the items are taken. However, again in the interests of standardisation, we have often reworded (and sometimes abbreviated) the glosses of our sources, while preserving the meaning. Where glosses were in a language other than English we have translated them. In the interests of space and legibility, and because data often have multiple sources, we have given the source of a reflex only when it is not included in the listings in Appendix A.

Sometimes our authors use the convention of providing no gloss beside the items in a cognate set whose gloss is identical to that of the POc (or other lower-order) reconstruction at the head of the set, i.e. the reconstruction which they reflect. Where necessary, we use ‘(N)’ to indicate that a gloss is a noun, and ‘(V)’, ‘(VI)’, ‘(VT)’ or ‘(VSt)’ to indicate that it is a verb, intransitive verb, transitive verb or stative verb. Because in many environments transitive verbs were regularly formed from the intransitive stem by adding the suffix *-i- (vol.5:24), in many cases the intransitive and transitive verbs are simply shown in sequence, e.g. POc *qalo(p), *qalop-i- ‘beckon with the palm downward, wave’. In such cases, the first verb is always intransitive, the second (in *-i-) transitive.

Within glosses we use the conventional abbreviations ‘k.o.’ (as in ‘k.o. yam’) for ‘kind of’, ‘s.o.’ for ‘someone’ and ‘s.t.’ for ‘something’.

**Table 1.2.** POc consonants used in reconstructions in the six volumes of this work.
	labialised bilabial	bilabial	dental	alveolar	palatal	velar	labialised velar	uvular
stop voiceless	*pʷ	*p	*t		*c	*k	*kʷ	*q
stop voiced	*bʷ	*b		*d	*j	*g
trill				*r
prenasalised trill				*dr
nasal	*mʷ	*m		*n	*ñ	*ŋ
fricative				*s
lateral				*l
approximant	*w				*y

6. Proto Oceanic bound morphology

Proto Oceanic bound morphology is not discussed in this volume, other than in §2.2, as the use of possessor suffixes with inalienably possessed nouns plays a role in reconstructions in chapter 2.

An account of aspects of POc morphology, especially verbal derivational morphology, is given in vol.5:21–26, where it is followed by some comments on the fossilisation of earlier morphology in POc forms (vol.5:26–30).

7. Proto Oceanic phonology and orthography

Work based on the sound correspondences of both Oceanic and non-Oceanic languages has resulted in the reconstructed paradigm of POc consonants shown in Table 1.2. A number of Oceanic (and non-Oceanic) languages attest to the fact that *t was dental and *d alveolar. This is significant in the prehistory of POc discussed below (§1.8.2.3). The POc vowels that occur in our reconstructions are *i, *e, *a, *o, *u.

In the light of recent work it is likely that both the consonant and vowel sets require some revision. We return to this in sections 1.8.2 and 1.8.3.

Lynch (2000a) concludes that POc stress fell on the penultimate mora. Each vowel counted as one mora, and so did the final consonant if there was one. Hence the stress of a word that ended in a vowel like *ku̱tu̱ ‘head louse’ (a mora is indicated by an underscore) fell on its penultimate syllable: *kútu. The stress of a word that had a final consonant, like *ma̱nu̱ḵ ‘bird’, fell on the final syllable: *manúk. Note that an inalienably possessed noun (§2.2) took a possessor suffix, and that this must have resulted in stress shift: *máta ‘eye’, but *matá-gu ‘my eye’. Inalienably possessed nouns are marked with a final hyphen in our reconstructions: *mata- ‘eye’.
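The stress rule just described is simple enough to state as a procedure. The sketch below is an illustration, not Lynch's own formulation: it assumes the five POc vowels, counts each vowel as one mora plus a word-final consonant if present, and places an acute accent on the vowel that carries the penultimate mora. The function name is hypothetical, and forms containing bracketed segments are not handled.

```python
VOWELS = "aeiou"
# Precomposed acute-accented vowels, written as escapes to avoid
# ambiguity between composed and decomposed Unicode forms.
ACUTE = {"a": "\u00e1", "e": "\u00e9", "i": "\u00ed",
         "o": "\u00f3", "u": "\u00fa"}


def mark_stress(form):
    """Mark the vowel bearing the penultimate mora with an acute accent."""
    vowel_positions = [i for i, ch in enumerate(form) if ch in VOWELS]
    letters = form.rstrip("-")  # ignore the hyphen on possessed nouns
    ends_in_consonant = bool(letters) and letters[-1] not in VOWELS
    # A final consonant is the last mora, so the penultimate mora is the
    # last vowel; otherwise it is the second-last vowel.
    offset = 1 if ends_in_consonant else 2
    if len(vowel_positions) < offset:
        return form  # too short for the rule to apply
    target = vowel_positions[-offset]
    return form[:target] + ACUTE[form[target]] + form[target + 1:]


for word in ["*kutu", "*manuk", "*mata", "*mata-gu"]:
    print(mark_stress(word))
```

Applied to the examples in the text, the procedure reproduces the shift from *máta to *matá-gu when the possessor suffix adds a mora.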

**Table 1.3.** POc orthographies after Grace (1969) and Ross (1988)
Grace	oral grade	*p	—	*t	*d/*r	*s		*j	*k	—
Ross	oral grade	*p	*pʷ	*t	*r	*s		*c	*k	*kʷ
Grace	nasal grade	*mp	*ŋp/*mpw	*nt	*nd		*nj		*ŋk
Ross	nasal grade	*b	*bʷ	*d	*dr/*nr		*j		*g
Grace	nasal		*ŋm/*mw				*ñ		*ŋ
Ross	nasal		*mʷ				*ñ		*ŋ

Table 1.3 shows two POc orthographies. The first was established by Biggs (1965) for PEOc and applied to POc by Grace (1969). It was used with a number of variants, separated by a slash in Table 1.3. The second orthography, used here and in the POc reconstructions in these volumes, is from Ross (1988, 1989b), with the addition of *pʷ (introduced without comment by Blust 1984) and *kʷ (Ross 2011). The terms “oral grade” and “nasal grade” belong to the terminology of Oceanic historical phonology (§1.8.1 and §1.8.2).

8. The phonological prehistory of Proto Oceanic⇫

In section 1 we expressed the hope that the material would be a rich source of data for historical linguistics. Section 1.8.2 and its subsections, along with §1.9, report on research based on the POc reconstructions in volumes 1–5. First, however, we recapitulate the currently conventional view of POc phonology.

The widely accepted hypothesis about the provenance of Proto Oceanic is shown in Figure 1.5. It is due to Robert Blust, who first presented it in Blust (1977b) and repeated it with modifications and accumulated supporting evidence in subsequent publications (Blust 1978a, 1982, 1983–84b, 1993, 2009b). New research based on the reconstructions in volumes 1–5, summarised in §1.8.2 and its subsections and in §1.9.1, proposes that this hypothesis (we will call it the “accepted hypothesis”) should be retired. The fresh research confronts us with the need to reassess the part of the tree that is headed by Proto Central/Eastern Malayo-Polynesian. This leads to a re-evaluation in §1.9.3 of where Proto Oceanic came from.

The conventions used in Figure 1.5 are those outlined in §1.4.3.1 for Figure 1.1. Thus Formosan languages in Figure 1.5 indicates a collection of languages descended (along with PMP) from PAn (Blust 1999). They are spoken in Taiwan, but do not form a subgroup. There was no “Proto Formosan”, as Formosan languages and language groups are all descended directly from PAn. Despite references to “Proto Western Malayo-Polynesian”, Western Malayo-Polynesian languages have never been seriously considered a subgroup of Austronesian (Ross 1995b; Adelaar 2004). Smith (2017) provides a set of hypotheses about the groups that make up WMP.²⁵ Their common ancestor is PMP. Recent years have seen renewed research into the Central Malayo-Polynesian languages and those of South Halmahera/West New Guinea, and we turn to this in §1.9.

**Figure 1.5:** Schematic diagram showing the widely accepted genealogy of the Austronesian family

8.1. The Proto Austronesian and Proto Malayo-Polynesian antecedents of Proto Oceanic phonology⇫

First, though, it is noteworthy that much research on the prehistory of the POc lexicon has focussed on phonological changes that occurred between PMP and POc. This is because PMP and POc are protolanguages clearly defined by shared innovations, the bedrock of the linguistic comparative method, whereas Blust’s two proposed interstages, PCEMP and PEMP (Blust 1978), are only weakly defined.

We give here a conventional account of POc innovations, before revising this history in §1.8.2 in the light of research based on the reconstructions in volumes 1–5.

**Map 1.5:** The Austronesian language family and the major subgroups according to the standard hypothesis

**Table 1.4.** Correspondences between PMP and POc protophonemes as currently understood. Shadings are explained in §1.8.2

PAn		*p, *b	—	*t, *C	*d, *r	*s, *z	*j	*k, *g	—
PMP		*p, *b	—	*t	*d, *r	*s, *z	*j	*k, *g	—
POc	oral grade:	*p	*pʷ	*t	*r	*s	*c	*k	*kʷ
POc	nasal grade:	*b	*bʷ	*d	*dr	*j		*g	—

PAn	*m	—	*n, *-L(-)	*ñ	*ŋ	*w	*y	*l, *L-	*q	*R	*S
PMP	*m	—	*n	*ñ	*ŋ	*w	*y	*l	*q	*R	*h
POc	*m	*mʷ	*n	*ñ	*ŋ	*w	*y	*l	*q	*R	*∅

PAn/PMP	*i, *-uy(-)	*e [ə], *-aw	*-ay	*a	*u
POc	*i	*o	*e	*a	*u

The Oceanic subgroup is defined by a set of shared innovations relative to PMP. It was on the basis of some of these that Dempwolff (1927, 1937) first recognised his Urmelanesisch (‘Proto Melanesian’) as a major Austronesian subgroup. In the 1937 work he also recognised that Polynesian languages shared the innovations of Urmelanesisch, and so the concept of an Oceanic subgroup entered the literature. However, naming it took a while. Grace (1955) defined the borders of the new subgroup and called it “Eastern Malayo-Polynesian”.²⁶

Meanwhile, Milke (1958) made frequent reference to ozeanisch-austronesische Sprachen (‘Oceanic-Austronesian languages’) and in 1961 finally adopted the terms ozeanische Sprachen and proto-Ozeanisch (‘Oceanic languages’, ‘Proto Oceanic’), which were soon adopted by his colleagues.

Correspondences between PAn, PMP and POc protophonemes are shown in Table 1.4. PAn protophonemes are shown for reference, as the volumes of this work cite PAn reconstructions fairly often.

Certain POc innovations exclusive to Oceanic languages are immediately visible in the form of a number of mergers and splits, highlighted in colour in Table 1.4.

(a) The PMP voiced/voiceless pairs *p/*b, *k/*g and *s/*z and the PMP pair *d/*r each merged, as *p, *k, *s and *r respectively, in an interstage that we label ‘Proto X’.
(b) Proto X *p, *k, *s and *r then split to give POc “oral-grade” *p, *k, *s and *r and “nasal-grade” *b, *g, *j and *dr (the “grade” terms are explained in §1.8.2). Although *t did not participate in the merger in (a), *t did participate in the split, with POc oral-grade *t and nasal-grade *d.
(c) A small complication is that PMP *j did not participate in the merger in (a), but did participate in the split in (b), its POc nasal grade merging with that of *s.

Ozanne-Rivierre (1992) suggests that the corresponding *t/*d merger was hindered by the mismatch in point of articulation between dental *t and alveolar *d, a mismatch attested in many non-Oceanic Austronesian languages.

Table 1.5 is a corrected and expanded version of the table in Blust (2013:599) showing examples of PMP reconstructions and their POc continuations. It illustrates the combined effect of (a) and (b): each of the PMP pairs *p/*b, *k/*g, *s/*z and *d/*r first merged and then split. The set of changes in (a) and (b) alone is unusual enough to be strong evidence for the integrity of the Oceanic subgroup.

**Table 1.5.** Examples of PMP reconstructions and their POc continuations showing the effects of the mergers and splits giving rise to POc consonant grade
segment	PMP	POc	grade	gloss
*p-	pitu	pitu	oral	seven
*p-	punay	bune	nasal	pigeon
*-p-	hapuy	api	oral	fire
*-mp-	t-umpu	tubu	nasal	ancestor
*b-	bulan	pulan	oral	moon
*b-	beRek	boRok	nasal	pig
*-b-	qabu	qapu	oral	ashes
*-mb-	ambit	abit	nasal	hold in hand
*t-	taqun	taqun	oral	year
*t-	—	—	(nasal)	—
*-t-	qutin	qutin	oral	penis
*-nt-	-nta	-da	nasal	P:1INC.PL
*-nt-	punti	pudi	nasal	banana
*d-	duha	rua	oral	two
*d-	daRaq	draRaq	nasal	fresh water
*-d-	kuden	kuron	oral	cooking pot
*-nd-	pandan	padran	nasal	pandanus
*s-s-	susu	susu	oral	breast
*s-	siRi	jiRi	nasal	a shrub: Cordyline
*-s-	ŋusuq	ŋuju-	nasal	lips, snout, beak
*z-	zaqat	saqat	oral	bad
*-z-	quzan	qusan	oral	rain
*z-	zalan	jalan	nasal	path, road
*-z-	tazim	tajim	nasal	sharp
*k-	kali	kali	oral	dig
*k-	kumuR	gumu	nasal	gargle, rinse mouth
*-k-	seka	soka	oral	pierce, stab
*-ŋk-	laŋkaw	lago	nasal	tall, long
*g-	gaway	kawe	oral	octopus tentacle
*g-	gemgem	gogom	nasal	hold in fist
*-g-	liget	likot	oral	turn, rotate
*-g-	—	—	(nasal)	—

Another set of innovations is the introduction of the labiovelars *pʷ, *bʷ, *mʷ and *kʷ into Proto Oceanic (Blust 1981a; Lynch 2002e; Ross 2011). Many items containing a labiovelar lack non-Oceanic cognates, and some, at least, must have been borrowed into POc from neighbouring Papuan languages. For example, *mʷapo(q) ‘taro’ was apparently borrowed by POc speakers as they copied taro-growing techniques from Papuan speakers (vol.3:267). In some inherited items a labial became a labiovelar next to a round vowel, but it is not clear whether the labiovelar actually occurred in POc. Thus a number of Oceanic languages reflect *tamʷata ‘man, husband’, derived from *tau ‘body, person’ + *mataq ‘unripe, immature, young’, but we cannot be sure whether *tamʷata or *taumata(q) was the POc form (vol.5:43–44).

Collectively, innovations affecting the vowels are also exclusive to Oceanic, although individually each of them occurs in various non-Oceanic languages:

PMP *e, phonetically [ə], became POc *o.
PMP word-final diphthongs *-uy(-), *-aw and *-ay were simplified to POc *-i, *-o and *-e respectively, the first two thereby merging with plain vowels.²⁷
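
The regular oral-grade outcomes described so far can be sketched as a segment-by-segment mapping. The Python below is a rough illustration under stated assumptions, not a full model of POc historical phonology. It applies only the mergers of the PMP voiced/voiceless pairs and of *d/*r, the oral-grade reflex *c of *j and the loss of *h shown in Table 1.4, the shift of *e to *o, and the simplification of word-final diphthongs. It does not predict nasal grade (nothing in the input form tells the sketch which items split), nor does it handle labiovelars or the reduction of trisyllables. The names `ORAL_GRADE`, `FINAL_DIPHTHONGS` and `oral_grade_reflex` are hypothetical.

```python
# PMP segment -> POc oral-grade outcome (segments not listed are unchanged).
ORAL_GRADE = {
    "b": "p",  # *p/*b merge as *p
    "g": "k",  # *k/*g merge as *k
    "z": "s",  # *s/*z merge as *s
    "d": "r",  # *d/*r merge as *r
    "j": "c",  # oral-grade reflex of *j
    "h": "",   # PMP *h is lost
    "e": "o",  # PMP *e, phonetically schwa, becomes *o
}

# Word-final PMP diphthongs and their POc outcomes.
FINAL_DIPHTHONGS = {"uy": "i", "aw": "o", "ay": "e"}


def oral_grade_reflex(pmp: str) -> str:
    """Return the expected POc oral-grade form of a PMP etymon."""
    form = pmp.lstrip("*")
    ending = ""
    # Handle the final diphthong first, so that its outcome *e is not
    # afterwards mistaken for PMP schwa and turned into *o.
    for diphthong, vowel in FINAL_DIPHTHONGS.items():
        if form.endswith(diphthong):
            form, ending = form[: -len(diphthong)], vowel
            break
    return "".join(ORAL_GRADE.get(seg, seg) for seg in form) + ending


for etymon in ("*bulan", "*hapuy", "*duha", "*kuden", "*gaway", "*liget"):
    print(etymon, "->", "*" + oral_grade_reflex(etymon))
```

Applied to oral-grade items from Table 1.5, the mapping returns the attested POc forms, for example *pulan ‘moon’, *api ‘fire’ and *kawe ‘octopus tentacle’. For nasal-grade items such as PMP *punay it would wrongly return an oral-grade initial, which is exactly the part of the history that a context-free mapping cannot capture.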

A further innovation that has come to light during work on these volumes concerns certain PMP trisyllabic roots with *-e- (*[ə]) as the nucleus of their penultimate syllable. These trisyllables lost *-e- in POc, along with one consonant of the resulting consonant cluster:

PMP *biseqak	POc *pisa(k)~*pisak-i- ‘split’ (vol.1:261)
PMP *ma-udehi	POc *muri ‘be behind’ (vol.2:251)
PMP *tuqelan	POc *tuqan ‘bone’ (vol.5:85)
PMP *baReqaŋ	POc *paRa(ŋ) ‘molar tooth’ (vol.5:133)
PMP *buteliR	POc *putiR ‘wart’ (vol.5:344)
PMP *buqeni	POc *puni ‘ringworm, Tinea imbricata’ (vol.5:346)
PMP *ma-heyaq	POc *maya(q) ‘shy, embarrassed; ashamed’ (vol.5:585)

The conditioning of this change remains unclear, as it did not affect the etyma below:

PMP *maqesak	POc *maosak ‘ripe, cooked’ (vol.1:157),
PMP *baqeRu	POc *paqoRu ‘new’ (vol.2:203),
PMP *qateluR	POc *qatoluR ‘egg’ (vol.4:278)
PMP *qulej-an	POc *quloc-a(n) ‘maggoty’ (vol.4:415).

PMP *qalejaw/POc *qaco ‘daylight, sun’ (vol.2:153–155) appears exceptionally to have lost the first consonant of the cluster, but there is evidence that a PAn variant *qajaw was ancestral to POc *qaco.

8.2. Reinterpreting the origins and distribution of POc oral- and nasal-grade consonants⇫

This section presents a revision of the history sketched in §1.8.1, as promised there.

Figure 1.6 diagrams three accounts of the history of POc *p and *b. In the first two accounts ‘(N)’, “nasal grade”, implies that POc *b reflected an earlier nasal + obstruent sequence (*mp, *mb) and was perhaps prenasalised (POc *[ᵐb]). The terms “oral grade” and “nasal grade” were coined by Grace (1959:27) to refer to the pairs of POc obstruents that had been recognised by Dempwolff (1927).

**Figure 1.6:** Three analyses of the phonological history of POc *p and *b

Dempwolff inferred that PMP *p and *b, for example, merged as POc *p, while PMP *mp and *mb merged as POc *b.²⁸ He made parallel assumptions about PMP *k/*g versus PMP *ŋk/*ŋg, and PMP *s/*z/*j versus PMP *ns/*nz/*nj.²⁹ He also assumed that, e.g., PMP *p and *mp, or *b and *mb, were in free variation and that they became fossilised randomly in each Oceanic daughter-language, such that a word might begin with a reflex of *p in one daughter-language but a reflex of *mp in another.

Despite the obvious improbability of this assumption and the frequent discussions of consonant grade, reviewed by Grace (1990), the randomness assumption was maintained in some form until the publication of Ross (1988).³⁰ The latter found that in the vast majority of POc etyma with one or more “graded” consonants, the grade of each consonant can be reconstructed unambiguously because its Oceanic reflexes agree in grade, a finding supported by the cognate sets in the present work. The illusion of randomness had two sources. First, although Milke (1968) had correctly identified POc *j (his *nj) as the nasal-grade consonant paired with oral-grade *s, most scholars assumed that various lenited reflexes of *s reflected the nasal grade, so that the pair of *s grades seemed almost chaotic (Ross 1988:71–93; 1989b). Second, various regular local processes such as Admiralties secondary nasal grade (Ross 1988:337–341) and Eastern Fijian apical prenasalisation (Geraghty 1983:74–96) had masked consonant grade in some languages.

The fact that consonant grade can be reconstructed without ambiguity in most POc etyma largely rids POc of Dempwolff’s posited randomness, but, as the middle panel in Figure 1.6 indicates, PMP *p and *b must have merged as Proto X *p, which then split into POc *p and *b. Similar processes applied to PMP *k/*g and *s/*z/*j. This is the position adopted in the introductions to volumes 1 to 5 of this work. Ross (1988) retained the assumption that the POc voiced obstruents were “nasal grade”, i.e. reflected nasal + obstruent sequences. He attempted unsatisfactorily to explain the splits as the effects of derivational morphology (Reid 2000).

This still leaves two questions about the origin of POc consonant grade unanswered:

(a) How did the POc splits come about?
(b) Do POc “nasal-grade” consonants have a nasal origin?

As a result of new research based on the POc reconstructions in volumes 1–5, we have a partial answer to (a) and a definitive answer to (b), shown in the righthand panel of Figure 1.6. Following Proto X (§1.8.1), this panel shows two further interstages, “ePOc” and POc. “POc” denotes the language reconstructed in these volumes, equated with its state immediately before its break-up into daughter-languages (Pawley 2008); and “ePOc” denotes “early POc”, a stage sometime before POc, but after its speakers settled in the Bismarck Archipelago.

Comparing reconstructions in previous volumes with their ancestral PMP forms in the ACD, we find that ePOc had three grades of obstruent: voiceless, voiced and prenasalised. Its voiceless obstruents are Grace’s oral-grade segments, but a majority of his “nasal-grade” segments reflect plain voiced obstruents. The prenasalised obstruents are true nasal-grade obstruents, reflecting inherited nasal + obstruent clusters. They may be inherited from PMP or from a more recent ancestor. This is the situation depicted in the righthand diagram of Figure 1.6, where the grey of the prenasalised obstruents indicates their rarity.

8.2.1. The POc voiceless and voiced obstruents⇫

Our database of POc reconstructions from volumes 1–5, along with their PMP ancestral forms (drawn directly from the ACD), contains 729 etyma.³¹ In total these reconstructions contain 429 initial and medial instances of the PMP obstruents listed in the leftmost column of Table 1.6. The columns headed ‘> POc’ show the voiceless and voiced outcomes of the PMP phonemes (prenasalised ePOc outcomes are discussed in the next subsection). To the right of each POc obstruent in Table 1.6 are shown its number of instances as an absolute figure and as a percentage of the PMP obstruent in the leftmost column.

**Table 1.6.** Instances of PMP obstruents and their POc voiceless and voiced reflexes
		POc voiceless reflexes			POc voiced reflexes
PMP	total	> POc	total	%	> POc	total	%
*p	94	*p	82	87.2	*b	12	12.8
*b	128	*p	101	78.9	*b	27	21.1
*s	75	*s	69	92.0	*j	6	8.0
*z	14	*s	10	71.4	*j	4	28.6
*-j-	17	*-c-	13	76.5	*-j-	4	23.5
*k	93	*k	91	97.8	*g	2	2.2
*g	8	*k	8	100.0	(*g)	0	0.0
*C	429	*C_voiceless	374	87.2	*C_voiced	55	12.8

The table tells a somewhat unexpected story. Only 13 per cent of the instances of PMP obstruents end up as POc voiced obstruents. It is also unclear whether Proto X *k actually split into POc *k and *g. PMP *p/*b, *k/*g and *s/*z each evidently merged as the Proto X phonemes *p, *k and *s. Proto X *p and *s then split into POc *p/*b and *s/*j respectively. If Proto X *k split, the outcome is inconsequential. Only eight instances of PMP *g occur in the first place, against 93 instances of PMP *k. No instances of PMP *g end up as POc *g, and just two instances of PMP *k do so.
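
The percentages in Table 1.6 follow directly from the counts. As a check, the short Python sketch below recomputes them. The counts are copied from the table, while the dictionary layout and the function name are illustrative choices only.

```python
# PMP obstruent -> (instances with a POc voiceless reflex,
#                   instances with a POc voiced reflex), from Table 1.6.
TABLE_1_6 = {
    "*p": (82, 12),
    "*b": (101, 27),
    "*s": (69, 6),
    "*z": (10, 4),
    "*-j-": (13, 4),
    "*k": (91, 2),
    "*g": (8, 0),
}


def voicing_shares(counts):
    """Return {PMP obstruent: (total, % voiceless, % voiced)}."""
    shares = {}
    for pmp, (voiceless, voiced) in counts.items():
        total = voiceless + voiced
        shares[pmp] = (
            total,
            round(100 * voiceless / total, 1),
            round(100 * voiced / total, 1),
        )
    return shares


shares = voicing_shares(TABLE_1_6)
all_voiceless = sum(v for v, _ in TABLE_1_6.values())
all_voiced = sum(v for _, v in TABLE_1_6.values())
grand_total = all_voiceless + all_voiced

for pmp, (total, pct_voiceless, pct_voiced) in shares.items():
    print(f"{pmp}\t{total}\t{pct_voiceless}\t{pct_voiced}")
print(f"all\t{grand_total}\t{round(100 * all_voiced / grand_total, 1)}% voiced")
```

The last line of output gives the overall share of voiced reflexes, the figure rounded to 13 per cent in the text.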

As noted above, PMP *t (129 instances) did not participate in these processes and is always reflected as POc *t. PMP *r, with 27 instances, is omitted from the table because all its POc outcomes are *r. PMP *d probably underwent a split, but the split was in prenasalisation, not in voicing (§1.8.2.3).

8.2.2. The POc prenasalised obstruents⇫

POc reflexes of PMP nasal + obstruent clusters are omitted from Table 1.6, as the numbers of reflexes are generally few and would skew the table’s percentages. Instead, POc reflexes of these PMP clusters are shown separately in Table 1.7. The instances are all in etyma drawn from the ACD (and found among the POc reconstructions in volumes 1–5). Instances of nasal + obstruent clusters that arose sometime between the break-up of PMP and the break-up of POc are not shown in Table 1.7, as they would obscure the relationship between PMP and POc.

PMP nasal + obstruent clusters are reflected as POc unitary phonemes. In fact their POc outcomes appear to be the same as those of PMP voiceless and voiced obstruents in Table 1.6, but we argue below in §1.8.2.4 that this is incorrect, and reconstruct ePOc prenasalised rather than voiced outcomes in Table 1.7. The PMP clusters are shown in the table as *-Np-/*-Nb- etc., as there are instances where the cluster is not homorganic. Some are the result of reduplication of a monosyllable, e.g., PAn/PMP *demdem ‘dark, gloomy, overcast’, attested with -md- in Formosan and many Philippine reflexes (ACD), but becoming *dendem at some intermediate interstage and thence POc *rodrom (vol.2:308). POc *-dr- is a unitary phoneme reflecting earlier *-nd- (PCEMP *-nd- according to Blust 1977a).

**Table 1.7.** Instances of PMP nasal + obstruent clusters and their POc reflexes
		POc voiceless reflexes		ePOc prenasalised reflexes
PMP	total	> POc	total	> ePOc	total
*-Np-	4	*-p-	2	*-ᵐb-	2
*-Nb-	12	*-p-	5	*-ᵐb-	7
*-Nk-	13	*-k-	8	*-ᵑg-	5
*-Ng-	2	*-k-	0	*-ᵑg-	2
*-Nt-	6	*-t-	3	*-ⁿd-	3
*-Nd-	3	*-r-	1	*-ⁿr-	2
*-Ns-	2	*-s-	1	*-ñj-	1
*-Nz-	1	*-s-	1	*-ñj-	0
totals	43		21		22

Blust (2022) shows that homorganic nasal + obstruent clusters were present in PMP, but were rare, as Table 1.7 confirms. Their very rarity has meant that scholars have paid little attention to them as a discrete category (Collins 1983 and Mills 1991 are exceptions). Further, reconstructions in the ACD for PCEMP, the next node below PMP in Blust’s tree (Figure 1.5), show little sign of acquiring nasal + obstruent clusters, other than those resulting from reduplications.

The ACD includes just four PCEMP items which contain nasal + obstruent clusters and have no cognates outside CEMP. They are:³²

PCEMP *tambu	POc *tabu	‘forbidden, taboo’ (this volume, chapter 10)
PCEMP *kandoRa	POc *kadroRa	‘cuscus’ (vol.4:225)
PCEMP *waŋka	POc *waga	‘canoe’ (vol.1:178)
PCEMP *mans[ə,a]r	POc *mʷaja(r,R)	‘bandicoot’ (vol.4:228)

Table 1.5 illustrates the fact that voiced and prenasalised obstruents are conventionally treated as a single—nasal-grade—POc category, as their reflexes in almost all Oceanic languages are identical. Of the POc medial nasal-grade items in that table, those reflecting PMP *t-umpu, *ambit, *-nta, *punti, *pandan and *laŋkaw ancestrally had a nasal + obstruent cluster, while those reflecting *ŋusuq and *tazim did not. Only 22 POc “nasal-grade” consonants in our database were descended from nasal + obstruent clusters (Table 1.7). Fifty-five reflect PMP plain voiceless or voiced obstruents (Table 1.6).
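
The relative weight of the two sources of “nasal-grade” consonants can be expressed as a proportion. The sketch below is simple arithmetic on figures taken from Tables 1.6 and 1.7. The resulting percentage is computed here for illustration and is not reported in the tables themselves, and the variable and function names are hypothetical.

```python
from fractions import Fraction

# ePOc prenasalised reflexes of PMP nasal + obstruent clusters (Table 1.7).
PRENASALISED = {
    "*-Np-": 2, "*-Nb-": 7, "*-Nk-": 5, "*-Ng-": 2,
    "*-Nt-": 3, "*-Nd-": 2, "*-Ns-": 1, "*-Nz-": 0,
}

# POc voiced reflexes of PMP plain obstruents (total row of Table 1.6).
VOICED_FROM_PLAIN = 55


def source_shares(cluster_counts, plain_count):
    """Return (count from clusters, overall count, share from clusters)."""
    from_clusters = sum(cluster_counts.values())
    total = from_clusters + plain_count
    return from_clusters, total, Fraction(from_clusters, total)


from_clusters, total, share = source_shares(PRENASALISED, VOICED_FROM_PLAIN)
print(
    f"{from_clusters} of {total} 'nasal-grade' consonants "
    f"({float(share):.1%}) descend from nasal + obstruent clusters"
)
```

On these figures fewer than a third of the consonants conventionally labelled “nasal grade” have a nasal origin.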

Table 1.6 allows us finally to understand where POc voiced initial consonants came from. Ever since Dempwolff (1927) the default assumption has been that they reflected nasal + obstruent clusters, with scholars trying, and failing, to find grounds to reconstruct ancestral initial nasal + obstruent clusters (Milner 1965; Ross 1988:39–43; Grace 1990; Reid 2000). The reason for the failure is now evident: POc initial “nasal-grade” obstruents actually reflect PMP plain voiceless or voiced obstruents (Table 1.6). PMP nasal + obstruent clusters were always medial (Table 1.7). They never occurred initially.

8.2.3. PMP *t, *d and *r⇫

We have seen that PMP *t and *d did not form a voiceless/voiced pair, as they had different points of articulation.

With regard to PMP *t, there is a mismatch between the findings reported in Table 1.6 and Table 1.7. The former reports that PMP *t did not undergo the merger-and-split sequence that affected PMP *p and *s, and therefore did not give rise to POc “nasal-grade” (voiced) reflexes. Hence PMP initial *t is never reflected as POc *d. But Table 1.7 reports three POc etyma reflecting PMP *-nt-, namely the P:1INC.PL suffix *-ⁿda (< PMP *-nta < *=ni-ta; Blust 1977b), *puⁿdi ‘banana’ (< PMP *punti) and *maⁿdala(q) ‘the morning star’ (< PMP *mantalaq-). This was the sole source of “nasal-grade” reflexes of *t, and the overall rarity of earlier nasal + obstruent sequences explains why POc has so few reflexes of *-nt-.

POc *r and *dr, outcomes of the split of PMP/Proto X *d, have conventionally been treated as one of the POc oral-/nasal-grade phoneme pairs (§1.8.2.1). Within the earlier framework this characterisation was correct, as the POc phonological contrast was evidently *[r] vs *[n(d)r].³³ However, we have above recast the conventional POc oral-/nasal-grade pairings as voiceless/voiced pairings. But the feature that distinguishes *dr from *r is prenasalisation, not voicing, so it does not belong to this pair set.

Our database has 40 instances of PMP *d, of which 33 are reflected as POc *r and seven as POc *dr. PMP *r, with 27 instances, is omitted from Table 1.6 because all its POc outcomes are *r. At some point the *r reflexes of PMP *d and *r merged as POc *r.

8.2.4. More evidence for POc prenasalised obstruents⇫

In most Oceanic languages the proposed POc voiced (§1.8.2.1) and prenasalised (§1.8.2.2) phonemes at each point of articulation have merged. The evidence that they were once separate is based primarily on the different sources of each and on the fact that the theory accounts neatly for the relative rarity of reflexes of PMP *-nt-. Had they already merged in POc? In this section we propose that they had not, because there is evidence from five Western Oceanic languages that the distinction between voiced and prenasalised obstruents posited for ePOc was retained in POc.

We know of five Western Oceanic languages that contrast voiceless, plain voiced and prenasalised voiced obstruents. They are Mangap (now better known as Mbula), Sio, Tami, Numbami and Sudest. The only close examination of the contrasts that persist in one of these languages is Bradshaw (1978) on Numbami. The first four languages are located in the area of greatest diversity within the North New Guinea cluster, and are not especially closely related, making them possible candidates for retaining an ancient feature. Sudest is a Papuan Tip language. Contra Ross (1988:192) the immediate ancestor of Sudest and Nimowa now appears to have been the first language to break away from the rest of the early Papuan Tip family, making Sudest another candidate for ancient retentions.³⁴ We refer to these five languages as the “distinction-retaining languages”.

The obstruent series in the distinction-retaining languages are:

Mangap
p	t	k
b	d	g
ᵐb	ⁿd	ᵑg

Sio
pʷ	p	t	k
bʷ	b	d	g
ᵐbʷ	ᵐb	ⁿd	ᵑg

Tami
pʷ	p	t	s	k	kʷ
bʷ	b	d	j	g	gʷ
ᵐbʷ	ᵐb	ⁿd	nj	ᵑg	ᵑgʷ

Numbami
p	t	s	k
b	d	z	g
-ᵐb-	-ⁿd-	-ⁿz-	-ᵑg-

Sudest
pʷ	p	t	s	k	kʷ
bʷ	b	d	j	g	gʷ
ᵐbʷ	ᵐb	ⁿd	nj	ᵑg	ᵑgʷ

A preliminary search for cognate sets reflecting POc etyma that include prenasalised consonants reveals an interesting pattern. A small group of etyma is almost always reflected with the prenasalised consonant intact, while a larger collection of etyma is reflected unpredictably with a mixture of plain voiced and prenasalised voiced reflexes. This larger collection suggests that in these items, plain and prenasalised consonants are gradually falling together into a single category. The membership of the small group of cognate sets is significant, as its members include some sets that reflect POc etyma that on independent evidence contained prenasalised obstruents in PMP or PCEMP.

Thus Blust (1977b) reconstructs PMP possessor suffixes that were prenasalised because they consisted of the morph ni + pronoun. They retain their prenasalised obstruents in ePOc:

*-ᵑgu	P:1SG	< PMP *-ŋku < *=ni-ku
*-ⁿda	P:1INC.PL	< PMP *-nta < *=ni-ta
*-dra	P:3PL	< PMP *-nda < *=ni-da

The first two of these are reflected in the distinction-retaining languages. The P:3PL suffix was replaced by PWOc *-dri.³⁵ At some point prenasalisation has been copied onto this etymon.

	P:1SG	P:1INC.PL	P:3PL
PMP	*-ŋku	*-nta	*-nda
ePOc	*-ᵑgu	*-ⁿda	*-ⁿra
POc	*-gu	*-da	*-dra, PWOc *-dri
Mangap	-ŋ	-ndV	-n
Sio	-ŋgu	-nda	-nzi
Tami	-ŋ	-n	-n
Numbami	-ŋgi	-ndi	-ndi
Sudest	-ŋgu	-nda	-nji

Further etyma with independent evidence of PMP or PCEMP prenasalised obstruents and reflected in the distinction-retaining languages are given below. A few comments are necessary. The blanks represent cases where, as far as we know, the etymon is not reflected in the relevant language. This pattern reflects the level of lexical replacement in Oceanic languages around the coasts of New Guinea.

	‘pandanus’	‘sago’	‘canoe’	‘betelnut’	‘banana’
PMP/PCEMP	*paŋdan	*R(a,u)mbia	*waŋka	*buaq	*punti
ePOc	*paⁿran	*Raᵐbia	*waᵑga	*ᵐbuaq	*puⁿdi
POc	*padran	*Rabia	*waga	*buaq	*pudi
Mangap	pānda	—	wōŋgo	mbu	pin
Sio	ponda	rambia	woŋga	—	—
Tami	—	lambi	waŋ	mbu	pun
Numbami	—	—	waŋga	buwa	undi
Sudest	—	mbi	waŋga	—	—

PMP *paŋdan acquired its nasal + stop sequence by losing *-u- from PAn *paŋudaN, leaving no doubt that the POc form had a prenasalised consonant. The evidence for the other forms above is less pressing, but they all have so many WMP reflexes with a nasal + stop cluster that one can be confident that the PMP or PCEMP form had the cluster, which was inherited into ePOc as a prenasalised obstruent (*waŋka is PCEMP). This is true of *punti, but if the argument about PMP *-nt- in §1.8.2.3 holds, then the POc form can only be a prenasalised stop.

POc *ᵐbuaq appears to be unique in having a prenasalised initial. The story of this form is difficult to reconstruct. According to the ACD’s version, PAn *buaq continued until POc, where it split into oral-grade-initial *puaq ‘fruit (including betelnut)’ and nasal-grade-initial *buaq ‘betelnut’. The mechanism of the split is unknown, but evidence shows that it occurred earlier than POc, as it is reflected in some Wallacean languages.³⁶

These cognate sets attest to the presence of ePOc *ᵐb, *ⁿd and *ᵑg in addition to the consonants in Table 1.2. Given that this preliminary search in distinction-retaining languages was confined to the 200-word lists in the Austronesian Basic Vocabulary Database (Greenhill et al. 2008) with some small additions from single-language sources,³⁷ the result is quite telling.

How do we account for the data from the distinction-retaining languages, four belonging to NNG, one to PT? More research is needed, but the account with the best fit says that they retain a distinction that was present in early POc, but lost in the vast majority of its daughter-languages. This represents drift, i.e. independent parallel innovation, probably due to the paucity of lexical items containing a prenasalised obstruent. Because almost all Oceanic languages lack the distinction between plain voiced and prenasalised voiced obstruents, researchers, including ourselves, have reconstructed POc without it. But since a few WOc languages retained the distinction at the time POc broke up, it should be reconstructed for POc. That is, “ePOc” and “POc” in the righthand panel of Figure 1.6 need to be recalibrated. “ePOc” is the real Proto Oceanic, and “POc” reflects the merger that by the time of its break-up had probably occurred in the dialects ancestral to all non-WOc languages, and in many WOc dialects too.

8.3. Revising the history of Proto Oceanic vowels?⇫

Lynch (2022) argues entirely on the basis of Oceanic evidence that the POc vowel system was not the neat conventionally accepted five-vowel system shown in §1.8.1, but a system partway between the PMP four-vowel system of *i, *e [ə], *a, *u and the five-vowel system that emerged later in most Oceanic languages. We showed in §1.8.1 that in the conventional view the sources of POc vowels were as follows:

POc *i < PMP *i, *-uy(-)
POc *u < PMP *u
POc *a < PMP *a
POc *-e < PMP *-ay
POc *o < PMP *ə, *-aw

Lynch suggests that the POc system of non-final vowels (i.e. discounting POc *-e and *-o from PMP *-ay and *-aw) was one of the following three:

(A)	*i		*u
		*ə
		*a

(B)	*i		*u
		*ə	*o
		*a

(C)	*i		*u
	*e	*ə	*o
		*a

Lynch’s revision suggests no change to the origins of high *i and *u or low *a. It is the mid vowels that changed, but he is uncertain when. His system A implies that there had been no change in the PMP system by the time POc broke up. Systems B and C both assume that PMP *ə was in the process of becoming *o when POc dispersed, and C assumes that *ə also became POc *e under certain conditioning.

9. Where did Proto Oceanic come from?⇫

The conventional answer to the question, “Where did Proto Oceanic come from?”, is the accepted hypothesis in Figure 1.5. It says that POc is the sibling of PSHWNG, and the two are the only children of PEMP (Blust 1978). PEMP in its turn is a sibling of the CMP languages, and they are all children of PCEMP (Blust 1982, 1983–84b, 1993). The latter is a sibling of WMP languages and a child of PMP. To our knowledge, no scholar disputes the claim that POc is descended from PMP. However, two recent pieces of research raise the need to look more closely at the intervening stages between PMP and POc.

The first, Kamholz (2014), uses a much larger body of evidence to establish the integrity of Blust’s (1978) PSHWNG on the basis of shared innovations. Kamholz does not examine the probity of PEMP, but the innovations that define his PSHWNG are different enough from those defining POc to invite a re-examination of the PEMP hypothesis.

The other work is Grimes & Edwards’ (in prep.) analysis of available CMP data. They identify eight CMP subgroups on the basis of mostly shared phonological innovations. They find areal similarities, some of them probably consequences of one or more Papuan substrates (see also Schapper 2015; 2018), but no significant exclusively shared innovations across subgroups, and thus no evidence for a putative Proto Central Malayo-Polynesian.

Blust (1993, 2009b) views the CMP languages as a linkage on the basis of innovations that chain (§1.4.3.1) various groups together,³⁸ but Grimes & Edwards find little evidence to support such an analysis. Blust’s arguments for PCEMP have evoked vigorous criticism (Donohue & Grimes 2008; Schapper 2011) and responses (Blust 2009b, 2012). The lack of evidence for Proto Central Malayo-Polynesian logically entails abandoning PCEMP as well, and this leaves a gap in the prehistory of POc according to the accepted hypothesis.

Kamholz and Grimes & Edwards indirectly prompt a further look at two POc-related questions:

(a) Are the SHWNG languages the closest relatives of Oceanic?
(b) How are SHWNG and Oceanic related to CMP groups?

Our answer to (a) is, no, the SHWNG languages are probably not the closest relatives of Oceanic. Our answer to (b) is that SHWNG appears more closely related to some of the CMP groups than to Oceanic, while the relationship of Oceanic to CMP languages is ambiguous, implying that it may have branched off the Austronesian tree separately from CMP, perhaps at a node from which various CMP groups branched, or perhaps at a higher node. We can only give a summary of findings here (for more detail see Ross, in prep.).

One other answer to the question, “Where did Proto Oceanic come from?” is implicit in the literature, and it would be remiss of us not to mention it. Bellwood (2011) suggests that Lapita pottery displays a likeness to contemporaneous pottery from the Marianas Islands in Micronesia. As far as we know, the only language then spoken in the Marianas was an earlier form of Chamorro, which originated in the northern Philippines (Blust 2000a). Bellwood’s hypothesis might imply a flow of early Chamorro speakers into the Bismarck Archipelago, but there is no linguistic indication of such a presence in POc or its descendants.³⁹

9.1. Blust (1978) on PEMP⇫

Much of Blust (1978), the seminal work on PEMP, is devoted to demonstrating the integrity of SHWNG. Kamholz’s (2014) analysis agrees. A smaller part of Blust’s paper is devoted to PEMP, i.e. to innovations shared by SHWNG and POc. Blust offers 53 shared lexical innovations, but no shared phonological or morphosyntactic innovation.

Claiming an exclusively shared lexical innovation carries with it an inherent risk. Might not the next dictionary of a non-EMP language include a cognate that renders the innovation non-exclusive and thereby non-probative? Of the 53 innovations, Ross (in prep.) rejects 32, or 60%, for the following reasons:

Eight are also found in at least one of the CMP groups to the west and south of SHWNG. The groups are, in Grimes & Edwards’ terminology, Seram-Tanimbar-Bomberai (6 innovations), Ambon-Seram (2), and Sula-Buru (1) (Map 1.6).⁴⁰
Seven have cognates in WMP languages.
For 14, Ross was unable to verify the supporting data. Their PMP reconstructions are absent from the ACD, implying that Blust later abandoned them.
One, *ma- ‘directional particle’, is likely to be the result of drift, i.e. independent parallel innovation.
One, *dui ‘dugong’, is interpreted by Blust as an idiosyncratic innovation in the word form, but it is in fact the outcome of regular phonological changes.
One, *mawa ‘enclosed space’, appears to be a chance resemblance.
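
The arithmetic behind these figures can be checked with a short sketch. The counts are copied from the list above; the category labels are informal paraphrases introduced here for illustration, not the authors' terms.

```python
# Tally of the reasons for rejecting proposed PEMP lexical innovations.
# Counts come from the list above; the dictionary keys are illustrative labels.
TOTAL_PROPOSED = 53

rejections = {
    "cognates in CMP groups": 8,
    "cognates in WMP languages": 7,
    "supporting data not verifiable": 14,
    "probable drift (*ma-)": 1,
    "regular sound change (*dui)": 1,
    "chance resemblance (*mawa)": 1,
}

rejected = sum(rejections.values())
not_rejected = TOTAL_PROPOSED - rejected
share = rejected / TOTAL_PROPOSED

print(f"rejected: {rejected} of {TOTAL_PROPOSED} ({share:.0%})")
print(f"not rejected: {not_rejected}")
```

The six categories sum to the 32 rejections reported, which is about 60% of the 53 proposals.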

**Map 1.6:** Grimes & Edwards’ Wallacean groups mentioned in the text

9.2. Phonological innovations in Oceanic and Wallacean languages⇫

It is convenient to refer to CMP and SHWNG languages together as the Austronesian languages of “linguistic Wallacea” (Schapper 2015), or, more simply in the present context, as Wallacean.

Table 1.8 shows innovations in consonants in the protolanguages of Oceanic and various Wallacean subgroups including SHWNG and others clustered close to it.⁴¹ The table makes no reference to innovations that occur in smaller subgroups within those shown. Often one or more of the innovations listed in the table does not occur in a subgroup’s parent language but does occur in lower-order subgroup(s) within it. This is part and parcel of the Wallacean pattern of shared innovations whereby isoglosses intersect, forming possible linkages. However, close inspection of the innovations shows that they affect certain PMP consonants across two or more Wallacean groups, suggesting that drift resulting from pressures on similar consonant systems is as likely a cause as shared inheritance.

**Table 1.8.** Consonant innovations in the parent languages of Oceanic and Wallacean subgroups (key beneath table)
PMP >	Oceanic	SHWNG	Ambon-Seram	Seram-Tanimbar-Bomberai	Aru	Sula-Buru
p > f		yes	yes	yes	init
p > f > *h					med
p > b	some
b > p	*β	some	yes		yes	yes
t > s/__*i_		yes
mp/mb > *ᵐb	some	yes		yes	?
mp/mb > *ᵐp			yes
nt/nd > *ⁿd		yes	yes	yes	yes
d > d-r-	some	yes	yes
d > r	some			yes	yes	yes
d > dr [ⁿr]	some
d/z > *d			yes
d/z > *r			yes
d/l > *r		yes
-j-/s > *s		yes
-j-/s > *j [ɟ]	some
-j-/l > *l			yes			some
*-j- > 0̸						some
-j-/R > *R			yes		yes
z/s > *s	yes
z/y merge					yes
ŋ > n			yes
q > 0̸		yes	some		yes	yes
*qa- etc lost		yes	some		yes	yes

‘some’ indicates that the change unpredictably applies to some etyma but not others;
an empty cell means ‘no’.

The innovation listed as ‘*qa- etc lost’ in the bottom row of Table 1.8 needs an explanation. It refers to the fact that words of three or more syllables whose first PMP syllable was *qa- or *ha- regularly lose that syllable in most Wallacean languages. This loss is probably associated with the loss of *q- or *h-, which is almost universal in Wallacean languages. Just one language, Watubela of the Seram-Tanimbar-Bomberai group, clearly retains *q as k, meaning that its retention must be reconstructed to Proto Seram-Tanimbar-Bomberai. Thus, for example, PMP *qateluR ‘egg’ is regularly reflected as POc *qatoluR (vol.4:278–279) and Watubela katlu, but as PSHWNG *tolo (Taba tolo, Mayá tól, Umar tor), Uyir tuli (Aru), Masiwang tolin (STB), Paulohi terur (AS).

What mainly concerns us in Table 1.8 is not the details of the innovations but their patterning, and particularly the considerable differences between Oceanic and the Wallacean groups. It is immediately clear that the SHWNG innovations pattern closely with those of other Wallacean subgroups, and barely at all with those of Oceanic.

As for the innovations of Oceanic, only one, the merger of PMP *s and *z as POc *s, is shared with a Wallacean group, Central Timor, far away from Oceanic. This is presumably a case of independent parallel innovation.

An obvious feature of POc in Table 1.8 is the number of cells containing ‘some’, indicating that the change applied to only some etyma. These refer to the obstruent splits noted in Table 1.6 and the associated discussion in §1.8.2.1 and §1.8.2.3.

Their significance here is that the merger-then-split pattern that gave rise to POc obstruent pairs has not occurred in the history of any Wallacean group. Table 1.9 shows PMP obstruents along with their PSHWNG and POc reflexes. The PSHWNG column shows one reflex for each PMP obstruent and for each PMP pair of nasal + obstruent clusters. This organisation is representative of all Wallacean groups as Grimes & Edwards (in prep.) reconstruct their histories. The POc column, however, shows the pairs of reflexes discussed earlier.

As an example, Figure 1.7 sets out the changes in PMP *p and *b, as they are reflected in PSHWNG and in POc. The PSHWNG changes are simple, and are similar to those in other Wallacean languages. The POc changes are more complex. Both PSHWNG and ePOc have three labial consonants, but they have developed along different routes.⁴²

**Table 1.9.** PMP obstruents and their PSHWNG and POc reflexes
	PMP	PSHWNG	POc
Bilabial	*p	*f	*p/*b
	*b	*p	*p/*b
	*-Np-/*-Nb-	*b	*p/*ᵐb
Dental	*t	*t	*t
	*-Nt-	*d	*ⁿd
Alveolar	*d	*r	*r/*dr
	*-Nd-	*d	*dr
Alveolar	*s	*s	*s/*j
	*z	*z	*s/*j
	*-Ns-/*-Nz-	?	*s/*ñj ?
Velar	*k	*k	*k/*g ?
	*g	?	*k
	*-Nk-/*-Ng-	*g	*ᵑg

**Figure 1.7:** The phonological histories of PSHWNG and POc reflexes of PMP *p and *b

9.3. Conclusion: so where did Proto Oceanic come from?⇫

Where then did Proto Oceanic come from? The phonological history that gave rise to the patterns in Table 1.9 is unlike that of the Wallacean languages and significantly more complicated. No Wallacean language, and as far as we know no WMP language, underwent a set of obstruent mergers like those that gave rise to Proto X, followed by the set of splits that gave rise to POc. Wallacean languages other than the Sula-Buru group, however, display a merger of PMP *-Nt- and *-Nd-, where POc has none. This implies that the ancestor of POc was separate from the ancestor(s) of the Wallacean languages when the Wallacean merger occurred.
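
The contrast drawn here can be checked against Table 1.9 with a small sketch. The reflex sets below are transcribed from that table, with cells marked “?” omitted; the names `PSHWNG`, `POC`, `split_sources` and `merged` are illustrative choices made for this sketch and are not part of the authors' analysis.

```python
# Reflexes of PMP obstruents, transcribed from Table 1.9.
# Cells marked "?" in the table are left out rather than guessed.
PSHWNG = {
    "*p": {"*f"}, "*b": {"*p"}, "*-Np-/*-Nb-": {"*b"},
    "*t": {"*t"}, "*-Nt-": {"*d"},
    "*d": {"*r"}, "*-Nd-": {"*d"},
    "*s": {"*s"}, "*z": {"*z"},
    "*k": {"*k"}, "*-Nk-/*-Ng-": {"*g"},
}
POC = {
    "*p": {"*p", "*b"}, "*b": {"*p", "*b"}, "*-Np-/*-Nb-": {"*p", "*ᵐb"},
    "*t": {"*t"}, "*-Nt-": {"*ⁿd"},
    "*d": {"*r", "*dr"}, "*-Nd-": {"*dr"},
    "*s": {"*s", "*j"}, "*z": {"*s", "*j"},
    "*g": {"*k"}, "*-Nk-/*-Ng-": {"*ᵑg"},
}


def split_sources(reflexes):
    """Return the PMP sources that have more than one reflex."""
    return sorted(src for src, out in reflexes.items() if len(out) > 1)


def merged(reflexes, a, b):
    """Return True if sources a and b have identical reflex sets."""
    return reflexes[a] == reflexes[b]


print("PSHWNG sources with paired reflexes:", split_sources(PSHWNG))
print("POc sources with paired reflexes:", split_sources(POC))
print("*-Nt-/*-Nd- merged in PSHWNG:", merged(PSHWNG, "*-Nt-", "*-Nd-"))
print("*-Nt-/*-Nd- merged in POc:", merged(POC, "*-Nt-", "*-Nd-"))
```

On this transcription no PSHWNG source has paired reflexes, several POc sources do, and *-Nt- and *-Nd- share a reflex in PSHWNG but not in POc. The sketch only restates the table and says nothing about the order in which the changes occurred.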

These differences, along with those in Table 1.8, indicate that POc has a history that is markedly different from those of the Wallacean languages, including SHWNG, and that Blust’s PEMP hypothesis is not valid, even though it was perfectly reasonable when it was proposed forty-five years ago. The question is, what do we replace it with? It is now obvious that POc is not a Wallacean offshoot, so where did it come from, genealogically? We don’t know.

**Figure 1.8:** Schematic diagram showing the implications of our analysis for the genealogy of the Austronesian family.

Figure 1.8 shows our dilemma. Do the Wallacean languages and POc have a common ancestor? There is some lexical evidence that they do, in the shape of the PCEMP etyma in the ACD and the 1978 PEMP etyma that are now known to have Wallacean cognates (§1.9.1), but, as we have observed, using lexical data in this way has disadvantages. These are matters for future research.

Meanwhile, we can say that using the lexical reconstructions in volumes 1–5 as sources for phonological history has proven to be a fitting conclusion to the present work.

Notes⇫

This introduction incorporates material in the introductions to Volumes 1–5, replicated so that each volume can be used independently, but also includes new material (§1.8 and §1.9). Our presentation of Oceanic subgrouping was revised in the introduction to volume 3, and this is largely retained here. We are indebted to Charles Grimes and Owen Edwards for their comments, especially on §§1.7-1.8.↩︎
The project, the brainchild of Andrew Pawley, has been jointly directed by him and by Malcolm Ross, with research assistance from Meredith Osmond, in the Department of Linguistics of the Research School of Pacific and Asian Studies and its successor, the College of Asia and the Pacific, at the Australian National University.↩︎
Ethnologue (Eberhard et al. 2022) lists 513 Oceanic languages, Glottolog 4.6 (Hammarström et al. 2022) lists 521. The two Micronesian exceptions are Chamorro in the Marianas and Palauan, both apparently single-language branches within western Malayo-Polynesian (see Figure 1.5). There is broad agreement that speakers of pre-Chamorro migrated from the northern Philippines, but the origin of Palauan remains a mystery (Blust 2000a; Reid 2002; Smith 2017). Zobel (2002) gives an alternative view.↩︎
Terms for people (rather than for kinship or rank) are reconstructed in vol.5, ch.2. They include ‘person’, ‘woman’, ‘man’, age cohort terms from early childhood to old age, terms for people by absence or deprivation of relationship (‘orphan’, ‘unmarried adult’, ‘widow(er)’) and for twins.↩︎
The 2020 ‘frozen’ ACD continues to be stored at the University of Hawai‘i (http://www.trussel2.com/acd/), but is now also available and under development in somewhat different format as part of the Cross-Linguistic Linked Data project (https://acd.clld.org/).↩︎
Differences in meaning are ignored here, but see §1.4.1.↩︎
A hyphen before or after a form for ‘four’ indicates the addition of a numeral classifier (§14.1.1).↩︎
For a lucid and concise account of the history of the matters we touch on here, and of the matters themselves, see François (2014).↩︎
In previous volumes, Appendix B, showing the groupings of Oceanic languages, followed Ross (1988), using the term family as a synonym for subgroup. This confusing usage is abandoned here.↩︎
In the jargon of biological phylogenetics a shared innovation is a synapomorphy.↩︎
Figure 1.4 and the right-hand diagram of Figure 1.2 were inspired by François’ (2014) diagram 6.3.↩︎
In this discussion of linkages, ‘language’ is used to mean ‘language or dialect’.↩︎
‘Eastern Fijian languages’ in Figure 1.1 is our label for Geraghty’s (1983) ‘Tokalau Fijian’.↩︎
WOc also includes the Sarmi/Jayapura (SJ) group (see Map 1.1). It may belong to the NNG linkage, but this is uncertain (Ross 1996b).↩︎
A second Oceanic reflex, ’Are’are rapi ‘a twin; two stones in one fruit’, was later added in the ACD.↩︎
Bearers of the Lapita culture had settled various parts of the Bismarck Archipelago by around 1400 BC (Specht 2007) and colonised the Reefs and Santa Cruz Is. in the Temotu Archipelago, Vanuatu and New Caledonia by about 1000 BC (Green 2003; Green, Jones & Sheppard 2008; Sand 2001b). Maybe a century later they settled in Fiji (Nunn et al. 2004; Clark & Anderson 2009). They reached Tonga by 850 BC (Burley & Connaughton 2007) and Samoa by 750 BC (Clark & Anderson 2009).↩︎
We included these nodes in the corresponding tree in Figure 1 of volumes 1 and 2, but this was too easily interpreted as a statement of our views on subgrouping.↩︎
The term ‘Eastern Oceanic’ and the search for evidence of an Eastern Oceanic subgroup have a relatively long pedigree in Oceanic linguistics (Biggs 1965; Pawley 1972, 1977; Lynch & Tryon 1985; Geraghty 1990). However, by the time volume 1 of the present work was published in 1998 it was evident that no convincing evidence supported an Eastern Oceanic subgroup. Our use of the term here is more inclusive than most, resembling the ‘Central/Eastern Oceanic’ of Lynch & Tryon (1983) (the 1985 published version is less inclusive) and of Lynch, Ross & Crowley (2002:94–96), who express reservations about its status.↩︎
Cases where such an inference can be made occur mostly at the boundary (in the Solomon Islands) between Western and Eastern Oceanic. Borrowing is likely (and is often reflected in unexpected sound correspondences) where an etymon occurs (i) in Western Oceanic and only in SE Solomonic languages or (ii) in SE Solomonic languages and only in the NW Solomonic languages (a subgroup within the Meso-Melanesian linkage of Western Oceanic).↩︎
On the positions of Yapese and Mussau, see respectively Ross (1996a) and Ross (1988:315–316, 331).↩︎
The main reason for retaining Ross’s orthography was that the electronic files initially used in this project were drawn in large part from those used in the research reported in Ross (1988).↩︎
Tone is rare in Oceanic languages, and very rare in the data in this volume. Tonal languages are Yabem, Bukawa (both NNG), Cèmuhî, Paicî, Drubea, Kwênyii, Numèè (all five NCal).↩︎
Geraghty (1990:91) records a small number of cases where certain Fijian dialects retain POc *R as l, indicating that it was retained sporadically in PCP. It is always lost in his ‘Tokalau Fijian’ and in Polynesian.↩︎
Another convention sometimes used for this purpose is a double asterisk, e.g. **tau: we prefer the dagger on aesthetic grounds.↩︎
We use Blust’s abbreviations for the groupings he discusses, including “CMP”. We use “central Malayo-Polynesian” and “western Malayo-Polynesian”, abbreviated “CMP” and “WMP” respectively, for the languages of his CMP and Western Malayo-Polynesian when we want to refer to them without Blust’s subgrouping assumptions.↩︎
Blust (1978a) redefined this label as also including the SHWNG languages.↩︎
The notation *-uy(-) reflects the fact that there is one known case where the change to *i occurred word-medially: PMP *kamuihu (independent 2PL pronoun) > *kamuyu > POc *kamiu.↩︎
We replace Dempwolff’s orthographies with those of Table 1.4.↩︎
Dempwolff (1927, 1937) and Milke (1961) both used the term Nasalverbindung for *mp, *mb etc, translated as ‘nasal cluster’ by Milner (1965). Grace coined terms that expressed the pairedness of *p/*mp etc. The assumption that a “nasal grade” consonant reflected an earlier nasal cluster is enshrined in his POc orthography (Table 1.3).↩︎
Grace notes in his 1990 paper that he had written the latter before he had access to Ross (1988).↩︎
This total excludes POc reconstructions for which no ancestor was found in the ACD.↩︎
We are aware that the PCEMP reconstructions for ‘cuscus’ and ‘bandicoot’ are controversial. The POc reconstructions, however, are well supported. See further Grimes & Edwards, in prep.↩︎
The POc digraph ‹dr› was adopted from Fijian orthography to represent POc *[nᵈr], the reflex of *dr in some Admiralties languages (Ross 1988:322) and in most Fijian dialects (Geraghty 1983:184).↩︎
From the small amounts of data in Sheppard (2020), Nimowa appears not to have prenasalised consonants.↩︎
PWOc *-dri reflects the nonhuman member of a human/nonhuman distinction found in the pronominal systems of a number of island languages to the west of New Guinea and adopted in Western Oceanic languages as the ordinary 3pl pronoun.↩︎
The *b- vs *mb- split is reflected, for example, in Dena-Oenal (Rote-Meto) boaʔ vs mbua; Tetun (Timor) fua-n vs bua; Uruangnirin (STB) pua-n vs buok; Masiwang (STB) fua-n vs bua; E Kola (Aru) fūi vs būi; but not in Buru-Lisela (Sula-Buru) fua-n ‘fruit’, fua ‘betelnut’.↩︎
Bradshaw (1978), Bugenhagen & Bugenhagen (2007b), Anderson (2007), Anderson & Ross (2002), Lincoln (1978), Ross’s fieldnotes.↩︎
Blust offers Proto Central Malayo-Polynesian reconstructions. We take this to be a convenient fiction to accommodate the reconstruction of etyma that are reflected only in CMP languages, similar in status to PWOc and PSOc reconstructions in the present work (§1.4.4.3).↩︎
The archaeology of Bellwood’s hypothesis is called into question by Clark & Winter (2019).↩︎
One innovation, *sakaRu ‘reef’, is found in Ambon-Seram and in Chamorro.↩︎
Many of these innovations are identified by Kamholz (2014) for SHWNG and by Grimes & Edwards (in prep.) for CMP languages. Four of the latter’s eight subgroups are shown. The others are the large and internally diverse Flores-Lembata and Timor-Babar subgroups with few shared innovations, the tiny Central Timor group, and their Taliabo group, related to languages of mainland Sulawesi either genealogically or through contact.↩︎
The term “various” in Figure 1.7 refers to the fact that phonemes reflecting PMP nasal + obstruent clusters have at various times in their various Wallacean and their Oceanic histories acquired new members by various processes, for example by abbreviating the PMP stative prefix *ma- to *m- or by reduplication of a syllable with a final nasal (§1.8.2.2).↩︎

Contents

The lexicon of Proto Oceanic: 6 People: society

Chapter 6.1 Introduction

Aims
The present volume
The relation of the current project to previous work
Reconstructing the lexicon
Conventions common to the series
Proto Oceanic bound morphology
Proto Oceanic phonology and orthography
The phonological prehistory of Proto Oceanic
Where did Proto Oceanic come from?

Cognatesets

References

ACD “Austronesian comparative dictionary (ACD)”
Adelaar 1992b “Proto-Malayic”
Adelaar 2004 “The Austronesian languages of Asia and Madagascar: a historical perspective”
Anderson 2007 “Sudest–English dictionary”
Anderson and Ross 2002 “Sudest”
Bellwood 2011 “Holocene population history in the Pacific region as a model for worldwide food producer dispersals”
Bender et al. 1983 “Micronesian cognate sets”
Bender et al. 2003a “Proto-Micronesian Reconstructions–1”
Bender et al. 2003b “Proto-Micronesian reconstructions–2”
Biggs 1965 “Direct and indirect inheritance in Rotuman”
Biggs 1978 “The history of Polynesian phonology”
Biggs and Clark 1993 “POLLEX (Polynesian lexicon)”
Blust 1970 “Proto-Austronesian Addenda”
Blust 1977a “A rediscovered Austronesian comparative paradigm”
Blust 1977b “The Proto-Austronesian pronouns and Austronesian subgrouping: a preliminary report”
Blust 1978a “Eastern Malayo-Polynesian: a subgrouping argument”
Blust 1980a “Early Austronesian social organization: the evidence of language”
Blust 1980c “Notes on Proto-Malayo-Polynesian phratry dualism”
Blust 1981a “Some remarks on labiovelar correspondences in Oceanic languages”
Blust 1982 “The linguistic value of the Wallace Line”
Blust 1982b “The Proto Austronesian word for “female””
Blust 1983 “A linguistic key to the early Austronesian spirit world”
Blust 1983–84a “Austronesian etymologies II”
Blust 1984 “A Mussau vocabulary, with phonological notes”
Blust 1986 “Austronesian etymologies – III”
Blust 1987 “Lexical reconstruction and semantic reconstruction: the case of Austronesian ‘house’ words”
Blust 1989 “Austronesian etymologies – IV”
Blust 1993 “Central and Central-Eastern Malayo-Polynesian”
Blust 1994 “Proto Malayo-Polynesian sibling terms”
Blust 1996a “The Neogrammarian Hypothesis and pandemic irregularity”
Blust 1998a “A note on higher-order subgroups in Oceanic”
Blust 1999 “Subgrouping, circularity and extinction: Some issues in Austronesian comparative linguistics”
Blust 2000a “Chamorro Historical Phonology”
Blust 2009b “The position of the languages of Eastern Indonesia: A reply to Donohue and Grimes”
Blust 2013 “The Austronesian languages. Revised edition”
Blust 2022 “Rare, but real: Native nasal clusters in Northern Philippine languages.”
Bradshaw 1978 “The development of an extra series of obstruents in Numbami”
Bugenhagen and Bugenhagen 2007b “Po Ta Ipiyooto Sua Mbula Uunu: Mbula-English Dictionary”
Burley and Connaughton 2007 “First Lapita settlement and its chronology in Vava’u, Kingdom of Tonga”
Bybee 1994 “A view of phonology from a cognitive and functional perspective”
Cashmore 1969 “Some Proto-Eastern Oceanic reconstructions with reflexes in Southeast Solomon Islands languages”
Clark 1985 “Languages of north and central Vanuatu: groups, chains, clusters and waves”
Clark 2009 “Leo tuai: A comparative lexical study of North and Central Vanuatu languages”
Clark and Anderson 2009 “Colonisation and culture change in Fiji”
Clark and Biggs 2006 “POLLEX: Polynesian lexicon”
Clark and Winter 2019 “The ceramic trail: Evaluating the Marianas and Lapita West Pacific connection”
Collins 1983 “The historical relationships of the languages of Central Maluku”
Dempwolff 1927 “Das austronesische Sprachgut in den melanesischen Sprachen”
Dempwolff 1934 “Vergleichende Lautlehre des Austronesischen Wortschatzes. Vol. 1”
Dempwolff 1937 “Vergleichende Lautlehre des Austronesischen Wortschatzes, Band 2: Deduktive Anwendung des Urindonesischen auf Austronesische Einzelsprachen”
Dempwolff 1938 “Vergleichende Lautlehre des Austronesischen Wortschatzes, Band 3: Austronesisches Wörterverzeichnis”
Donohue and Grimes 2008 “Yet more on the position of the languages of eastern Indonesia”
Dyen and Aberle 1974 “Lexical reconstruction: the case of the Proto-Athapaskan kinship system”
Eberhard et al. 2022 “Ethnologue: Languages of the world. 25th edn.”
François 2011b “Where *R they all? The geography and history of *R-loss in Southern Oceanic languages”
François 2014 “Trees, waves and linkages: Models of language diversification”
French-Wright 1983 “Proto-Oceanic horticultural practices”
Geraghty 1983 “The history of the Fijian languages”
Geraghty 1986 “The sound system of Proto Central Pacific”
Geraghty 1989 “The reconstruction of Proto-Southern Oceanic”
Geraghty 1990 “Proto-Eastern Oceanic *R and its reflexes”
Geraghty 1996 “Problems with Proto Central Pacific”
Geraghty and Pawley 1981 “The relative chronology of some innovations in the Fijian languages”
Grace 1955 “Subgrouping Malayo-Polynesian: a report of tentative findings”
Grace 1959 “The position of the Polynesian languages within the Austronesian (Malayo- Polynesian) language family”
Grace 1969 “A Proto-Oceanic finder list”
Grace 1990 “‘Consonant grade’ in Oceanic languages”
Grace 1996 “Regularity of change in what?”
Green 2003 “The Lapita horizon and traditions: signature for one set of oceanic migrations”
Greenhill and Clark 2011 “POLLEX-Online: the Polynesian Lexical Project Online”
Greenhill et al. 2008 “The Austronesian Basic Vocabulary Database: From bioinformatics to lexomics”
Hammarström et al. 2022 “Glottolog 4.7”
Haudricourt and Ozanne-Rivierre 1982 “Dictionnaire thématique des langues de la région de Hienghène (Nouvelle Caledonie), Pije, Fwâi, Nemi, Jawe (Thematic dictionary of the Hienghène languages, New Caledonia)”
Hockett 1976 “The reconstruction of Proto Central Pacific”
Jackson 1983 “The internal and external relationships in the Trukic languages of Micronesia”
Jackson 1986 “On determining the external relationships of the Micronesian languages”
Kamholz 2014 “Austronesians in Papua: Diversification and change in South Halmahera–West New Guinea”
Lackey and Boerger 2021 “Reexamining the phonological history of Oceanic's Temotu subgroup”
Levy 1979 “The phonological history of the Bugotu-Nggelic languages and its implications for Eastern Oceanic”
Levy 1980 “Languages of the southeast Solomon Islands and the reconstruction of Proto-Eastern-Oceanic”
Lichtenberk 1985a “Possessive constructions in Oceanic languages and Proto-Oceanic”
Lichtenberk 1986 “Leadership in Proto Oceanic society: linguistic evidence”
Lichtenberk 1988 “The Cristobal–Malaitan subgroup of Southeast Solomonic”
Lichtenberk 1994b “Reconstructing heterogeneity”
Lincoln 1978 “The Rai Coast survey data [in two parts]”
Lynch 1978a “Proto-Central Papuan: a reassessment”
Lynch 1980 “Proto-Central Papuan phonology”
Lynch 1984 “On the Proto-Oceanic word for 'citrus'”
Lynch 1999 “Southern Oceanic linguistic history”
Lynch 2000a “Reconstructing Proto-Oceanic stress”
Lynch 2000b “A grammar of Anejom̘”
Lynch 2001b “Some shared developments in pronouns in languages of Southern Oceania”
Lynch 2001c “The linguistic history of southern Vanuatu”
Lynch 2002e “The Proto Oceanic labiovelars: some new observations”
Lynch 2004d “Proto Southern Oceanic reconstructions: body parts and substances”
Lynch 2015 “The phonological history of Iaai”
Lynch and Tryon 1983 “Central Oceanic: a subgrouping hypothesis”
Lynch and Tryon 1985 “Central-Eastern Oceanic: a subgrouping hypothesis”
Lynch et al. 2002 “The Oceanic languages”
Milke 1958 “Zur inneren Gliederung und geschichtlichen Stellung der ozeanisch-Austronesischen Sprachen”
Milke 1958b “Ozeanische Verwandtschaftsnamen”
Milke 1961 “Beiträge zur ozeanischen Linguistik”
Milke 1968 “Proto-Oceanic addenda”
Mills 1991 “Tanimbar-Kei: An eastern Austronesian subgroup”
Milner 1965 “Initial nasal clusters in eastern and western Austronesian”
Næss and Boerger 2008 “Reefs-Santa Cruz as Oceanic: evidence from the verb complex”
Nothofer 1975 “The reconstruction of Proto-Malayo-Javanic”
Nunn et al. 2004 “Early Lapita settlement at Bourewa, southwest Fiji”
Ozanne-Rivierre 1992 “The Proto-Oceanic consonantal system and the languages of New Caledonia”
Ozanne-Rivierre 1995 “Structural changes in the languages of Northern New Caledonia”
Pawley 1972 “On the internal relationships of Eastern Oceanic languages”
Pawley 1975 “The relationships of the Austronesian languages of Central Papua”
Pawley 1977 “On redefining 'Eastern Oceanic'”
Pawley 1978 “The New Guinea Oceanic hypothesis”
Pawley 1982a “Rubbishman commoner, big-man chief: linguistic evidence for hereditary chieftainship in Proto-Oceanic society”
Pawley 1985 “Proto-Oceanic terms for ‘person’: a problem in semantic reconstruction”
Pawley 1996a “On the Polynesian subgroup as a problem for Irwin's continuous settlement hypothesis”
Pawley 1996b “Proto Oceanic terms for reef and shoreline invertebrates”
Pawley 1996c “On the position of Rotuman”
Pawley 2008 “Where and when was Proto Oceanic spoken? Linguistic and archaeological evidence”
Pawley 2009 “The role of the Solomon Islands in the first settlement of Remote Oceania: bringing linguistic evidence to an archaeological debate”
Pawley 2011 “On the position of Bugotu and Gela in the Guadalcanal-Nggelic subgroup of Oceanic”
Pawley and Green 1984 “The Proto-Oceanic language community”
Pawley and Ross 1994 “Austronesian terminologies: continuity and change”
Pawley and Ross 1995 “The prehistory of Oceanic languages: a current view”
Reid 1982 “The demise of Proto-Philippines”
Reid 2000 “Sources of Proto-Oceanic initial prenasalization: The view from outside Oceanic”
Reid 2002 “Morphosyntactic evidence for the position of Chamorro in the Austronesian language family”
Ross 1988 “Proto Oceanic and the Austronesian languages of western Melanesia”
Ross 1989b “Proto-Oceanic consonant grade and Milke's */nj/”
Ross 1992b “The position of Gumawana among the languages of the Papuan Tip cluster”
Ross 1994a “Central Papuan culture history: some lexical evidence”
Ross 1995b “Some current issues in Austronesian linguistics”
Ross 1996a “Is Yapese Oceanic?”
Ross 1996b “On the genetic affiliations of the Oceanic languages of Irian Jaya”
Ross 2011 “Proto-Oceanic *kʷ”
Ross 2014 “Reconstructing the history of languages in northwest New Britain: Inheritance and contact”
Ross 2017 “Linguistic evidence for prehistory: Oceanic examples”
Ross and Næss 2007 “An Oceanic origin for Aiwoo, the language of the Reef Islands?”
Sand 2001 “Evolutions in the Lapita cultural complex: a view from the Southern Lapita Province”
Schapper 2015 “Wallacea, a linguistic area”
Sheppard 2020 “Verbal morphosyntax and three-participant events in Sudest, an Oceanic language of Papua New Guinea”
Smith 2017 “The Western Malayo-Polynesian Problem”
Sneddon 1984 “Proto-Sangiric and the Sangiric languages”
Specht 2007 “Small islands in the big picture: the formative period of Lapita in the Bismarck Archipelago”
Tryon 1976 “New Hebrides languages: an internal classification”
Tryon and Hackman 1983 “Solomon Islands languages: an internal classification”
Tsuchida 1976 “Reconstruction of Proto-Tsouic phonology”
Walsh and Biggs 1966 “Proto-Polynesian word list 1”
Walter 1989 “Lapita fishing strategies: a review of the archaeological and linguistic evidence”
Walworth 2014 “Eastern Polynesian: The Linguistic Evidence Revisited.”
Waterson 1993 “Houses and the built environment in Island Southeast Asia”
Zobel 2002 “The position of Chamorro and Palauan in the Austronesian family tree: evidence from verb morphosyntax”
Zorc 1977 “The Bisayan dialects of the Philippines: subgrouping and reconstruction”
Zorc 1986 “The genetic relationships of Philippine languages”
