WWW version 1.0jg 23.3.99
The data was gathered as part of a lexicostatistical survey of about 100 Tanzanian languages undertaken in the early 1970s by Derek Nurse, Gérard Philippson and a team from the Department of Foreign Languages and Linguistics of the University of Dar es Salaam. For more details on the survey and the early uses to which the data was put, see:
The source (paper) documents for these computer files were sets of printed forms, each set containing 1079 entries in parallel columns of Swahili and English. These forms were distributed to native Tanzanians for translation into the target Tanzanian languages. Each set of forms began with a page of instructions and a short section for particulars on the individual and the language involved in the documentation. [NOTE: These "particulars" are not available on the Web.]

The printed forms for the survey are numbered in two sections. The main section consists of a wordlist in parallel columns of Swahili and English with room for a translation into the target language. This section contains 1052 entries: the numbers run from 1 to 1038 with entry 929 missing (1037 entries), and 15 additional entries carry the suffix a, e.g. entry 50a, which follows entry 50. Following this is a short section of phrases in English, some of which are translated into Swahili, for which single terms in the target language were sought. This section, containing 27 entries, is lettered from A to Z, with a final entry designated by a slash (/). The two sections are combined in the datafiles and in the database.
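The entry counts given above can be checked with a short script. The following Python sketch is illustrative only and is not part of the survey's own software: the function names are hypothetical, and because only 50a is named among the 15 suffixed entries, those entries are represented by a count and not by a list of labels.

```python
import string

# Figures taken from the description of the printed forms.
LAST_NUMBER = 1038    # highest entry number in the main section
MISSING = {929}       # entry number that does not occur
SUFFIXED_COUNT = 15   # entries such as 50a; the full list is not given here


def numbered_labels():
    """Plain numeric labels of the main section: 1..1038 without 929."""
    return [str(n) for n in range(1, LAST_NUMBER + 1) if n not in MISSING]


def phrase_labels():
    """Labels of the phrase section: A..Z followed by a slash."""
    return list(string.ascii_uppercase) + ["/"]


def sort_key(label):
    """Order labels as on the forms: 50 < 50a < 51, phrase section last."""
    if label[0].isdigit():
        digits = label.rstrip("a")
        return (0, int(digits), label[len(digits):])
    # "/" follows Z on the forms, so rank it after the letters explicitly.
    return (1, 27 if label == "/" else ord(label) - ord("A") + 1, "")


main_total = len(numbered_labels()) + SUFFIXED_COUNT
grand_total = main_total + len(phrase_labels())
print(main_total, grand_total)
```

The two totals correspond to the 1052 entries of the main section and the 1079 entries of a complete set of forms. The hypothetical `sort_key` shows one way to keep a suffixed entry directly after its base number, and the slash entry after Z, when labels from both sections are sorted together.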
The computerization of the data is the work of a number of individuals, including: