Human-Language-Technologies

This is my portfolio for CS 4395.001 at UT Dallas

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

Overview of NLP

This is my first assignment for the class. It required me to answer questions about NLP.

My responses are here and the assignment description is here

Text Processing with Python

Overview

This assignment reads in a CSV file and verifies that all of the data is uniform. If an entry does not follow the standard, the program prompts the user to enter a new value until it follows the set standard.
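
The validate-and-reprompt loop described above can be sketched roughly as follows. This is a minimal illustration, not the assignment's actual code: the `ID_PATTERN` format (two letters followed by four digits), the `prompt_until_valid` helper, and the injectable `ask` parameter are all assumptions made for the example.

```python
import re

# Hypothetical standard: two letters followed by four digits (e.g. "ab1234").
ID_PATTERN = re.compile(r"[A-Za-z]{2}\d{4}")


def prompt_until_valid(value, pattern=ID_PATTERN, ask=input):
    """Return value if it matches pattern; otherwise keep asking for a new one."""
    while not pattern.fullmatch(value):
        value = ask(f"'{value}' is not valid. Enter a new value: ").strip()
    return value
```

Passing `ask` as a parameter keeps the loop easy to exercise without a keyboard; in normal use it defaults to the built-in `input`.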

Instructions

To run this program, run the Python file and pass the location of data.csv as the command-line argument.
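
For readers unfamiliar with command-line arguments, one way a script might pick up the CSV path is sketched below. The `load_rows` helper and the usage message are hypothetical and only illustrate the idea; the real program may read its argument differently.

```python
import csv


def load_rows(argv):
    """Read every row of the CSV file whose path is the first argument."""
    if len(argv) < 2:
        # No path was supplied on the command line.
        raise SystemExit("usage: python <script>.py <path to data.csv>")
    with open(argv[1], newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# In a script, the call would typically be: rows = load_rows(sys.argv)
```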

Text Processing

A major strength that I found in using Python is that it is much simpler to process text than in other languages. A weakness would be the speed at runtime.

What I learned

In this assignment I learned about applying regex in a program. In the past I had to write regular expressions for Automata Theory; however, we never actually used them in a program.

My code is here and the assignment description is here

Text Processing with Python

This assignment was my first time using NLTK. For the first half of the assignment, I used one of the built-in texts in the NLTK API. For the text, I extracted and printed tokens, used the concordance method, and compared the API count method with the default Python count method. Next, I got to choose my own text to process and then stem.
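
As a rough sketch of the count comparison and the stemming step, the snippet below uses a made-up sentence rather than one of the NLTK built-in texts, and splits on whitespace so that no tokenizer data needs to be downloaded. The sample sentence, the variable names, and the choice of the Porter stemmer are assumptions for illustration; the notebook may use a different text, tokenizer, and stemmer.

```python
from nltk.stem import PorterStemmer
from nltk.text import Text

# Whitespace split keeps the sketch free of tokenizer model downloads;
# the assignment itself may have used an NLTK tokenizer.
raw = "the whale swam and the whales were swimming near the ship"
tokens = raw.split()

text = Text(tokens)
nltk_count = text.count("the")      # count via the NLTK Text API
python_count = tokens.count("the")  # count via the built-in list method

# Reduce each token to its stem with the Porter stemmer.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(nltk_count, python_count)
print(stems)
```

Both counts agree here because `Text.count` counts occurrences in the same token list.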

The notebook is here, the pdf is here, and the assignment description is here

Guessing Game

I had a ton of fun with this assignment. I had to use NLTK to extract the 50 most common nouns from the text file provided by the professor and create a word game out of them. The word game is similar to hangman; however, you play until you miss 5 more letters than you get correct.
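
The stopping rule can be sketched as below. This is one possible reading of the rule, not the assignment's code: the `play` function, its `max_deficit` parameter, and the choice to ignore repeated guesses are assumptions, and the noun extraction with NLTK is left out.

```python
def play(word, guesses, max_deficit=5):
    """Play one round and return (won, revealed_word).

    The round ends when the word is fully revealed, when the player enters
    '!', or when misses exceed hits by max_deficit (an assumed reading of
    the rule).
    """
    revealed = ["_"] * len(word)
    hits = misses = 0
    tried = set()
    for guess in guesses:
        if guess == "!":  # exit character
            break
        if guess in tried:  # assumed: repeated guesses are ignored
            continue
        tried.add(guess)
        if guess in word:
            hits += 1
            for i, ch in enumerate(word):
                if ch == guess:
                    revealed[i] = ch
        else:
            misses += 1
        if "_" not in revealed:
            return True, word
        if misses - hits >= max_deficit:
            break
    return False, "".join(revealed)
```

Taking the guesses as an iterable, rather than calling `input` inside the loop, makes the rule easy to check with a fixed list of letters.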

Instructions

To run this program, first download the Python program and the text file and save them in the same folder. You can use a different text file if you would like; just make sure it is long enough to include at least 50 nouns. When you run the Python file, pass the text file name as the argument in the command. The game is relatively straightforward to play: type a letter as your guess and press Enter. To exit the game, input the character ‘!’.

Example command: py assignment3_mdp180000.py anat19.txt

My code is here and the assignment description is here

WordNet

My code is here and the assignment description is here

N-grams

Part 1 of my code is here and part 2 of my code is here. The assignment description is here

Web Crawler

My code is here, my writeup is here, and the assignment description is here