Transcribing Filipino Syllables into Baybayin Script Using Convolutional Neural Network with Long Short-Term Memory Architecture for Spoken Tagalog Recognition
Keywords:
Speech recognition, Baybayin, machine learning, phoneme mapping, transcribing

Abstract
This study describes the use of machine learning technologies to convert spoken Tagalog syllables into the Baybayin script, which was used in the Philippines long before the arrival of the Spaniards. The model was trained on audio data capturing the phonetic properties of Tagalog syllables and their correct mapping to Baybayin symbols. It achieved an overall accuracy of 96%, demonstrating reliable performance in transcribing spoken Tagalog segments into Baybayin text. The CNN-LSTM architecture proved effective, underscoring the potential of advanced speech recognition technologies for cultural preservation. By modeling the phonetic-to-symbolic relationships in spoken Tagalog, the system offers valuable contributions to linguistic research, especially in areas such as phonology, orthography, and morpho-syllabic analysis of Filipino, thereby bridging traditional scripts and modern language technologies. This study also emphasizes the need for further dataset expansion and support for diverse linguistic variations to enhance the system's inclusivity and applicability. It is an important addition to technology-based cultural preservation, paving the way for similar projects in other languages and scripts. Future studies may extend the system to transcribe sentences, phrases, and paragraphs.
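To illustrate the kind of architecture the abstract refers to, the following is a minimal sketch of a CNN-LSTM classifier that maps audio feature sequences (e.g., MFCC frames of a spoken syllable) to Baybayin symbol classes. The input shape, layer sizes, and number of output classes are illustrative assumptions, not the configuration reported in this study.

```python
# Minimal sketch of a CNN-LSTM syllable classifier (illustrative only).
# Assumptions: MFCC inputs of 100 frames x 40 coefficients, and an
# example inventory of 17 Baybayin base symbols.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES = 100   # assumed time frames per syllable clip
NUM_MFCC = 40      # assumed MFCC coefficients per frame
NUM_SYMBOLS = 17   # assumed Baybayin symbol inventory

def build_cnn_lstm(num_frames=NUM_FRAMES, num_mfcc=NUM_MFCC,
                   num_classes=NUM_SYMBOLS):
    """CNN front end for local spectral patterns, followed by an LSTM
    that models the temporal structure of the syllable."""
    inputs = layers.Input(shape=(num_frames, num_mfcc, 1))
    # 2-D convolutions over (time, frequency) extract local acoustic features.
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (3, 3), padding="same", activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    # Collapse frequency and channel axes so each remaining time step
    # becomes one feature vector for the recurrent layer.
    x = layers.Reshape((x.shape[1], x.shape[2] * x.shape[3]))(x)
    x = layers.LSTM(128)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_cnn_lstm()
    model.summary()
    # Dummy batch: 8 random "syllable clips" with random labels,
    # only to show the expected input/output shapes.
    x = np.random.rand(8, NUM_FRAMES, NUM_MFCC, 1).astype("float32")
    y = np.random.randint(0, NUM_SYMBOLS, size=(8,))
    model.fit(x, y, epochs=1, verbose=0)
```

In a setup of this kind, the convolutional layers capture local spectral patterns within the syllable while the LSTM models how those patterns unfold over time, which is the phonetic-to-symbolic mapping the abstract describes at a high level.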