ThinkMind // DBKDA 2011, The Third International Conference on Advances in Databases, Knowledge, and Data Applications // View article dbkda_2011_8_40_30118


From Synchronous Corpus to Monitoring Corpus, LIVAC: The Chinese Case

Authors:
Benjamin K. Tsou
Andy C. Chin
Oi Yee Kwong

Keywords: monitoring corpus; synchronous corpus; homothematic corpus; LIVAC; the Chinese language

Abstract:
Very large corpora of properly processed textual materials are uncommon, but they can provide important resources for language modeling in natural language processing, ranging from speech processing and text input to automatic information retrieval (IR) and patent translation. Moreover, when properly cultivated in spatial-temporal terms, they can foster innovative knowledge discovery in database applications by functioning as a monitoring corpus, and they can enhance the human-centered communication environment by allowing more substantive introspection and comparison of linguistic and socio-cultural developments in the relevant speech communities. This paper discusses how LIVAC, the gigantic synchronous and homothematic corpus of Chinese, can contribute to monitoring linguistic homogeneity and heterogeneity both diachronically and synchronically. After processing media texts of more than 400 million Chinese characters over 16 years, LIVAC has yielded a lexical corpus of 1.5 million words. This paper examines some aspects of the nature and extent of lexical and morphological divergence and convergence in the Chinese language of Hong Kong, Taipei and Beijing. Additional discussions cover the creation and relexification of neologisms, categorial fluidity, and the associated challenges to terminology standardization, such as renditions of non-Chinese personal names. This paper also explores how the associated socio-cultural developments can be fruitfully monitored by means of this unique spatial-temporal corpus.

Pages: 175 to 180

Copyright: Copyright (c) IARIA, 2011

Publication date: January 23, 2011

Published in: DBKDA 2011, The Third International Conference on Advances in Databases, Knowledge, and Data Applications

ISSN: 2308-4332

ISBN: 978-1-61208-115-1

Location: St. Maarten, The Netherlands Antilles

Dates: from January 23, 2011 to January 28, 2011
