Natural Language Processing: An Introduction

What Is Natural Language Processing?
Natural Language Processing (NLP) is the technology that enables software to understand natural human language.

It is a branch of artificial intelligence that deals with the interaction between computers and humans through natural language.

NLP tasks can be divided into four main areas, each with its own subtasks:

  1. Syntax
    1. Grammar induction
    2. Lemmatization
    3. Morphological segmentation
    4. Part-of-speech tagging
    5. Parsing
    6. Sentence breaking
    7. Stemming
    8. Terminology extraction
  2. Semantics
    1. Lexical semantics
    2. Distributional semantics
    3. Machine translation
    4. Named entity recognition (NER)
    5. Natural language generation
    6. Natural language understanding
    7. Optical character recognition (OCR)
    8. Question answering
    9. Recognizing textual entailment
    10. Relationship extraction
    11. Sentiment analysis
    12. Word sense disambiguation
    13. Topic segmentation and recognition
  3. Discourse
    1. Automatic summarization
    2. Coreference resolution
    3. Discourse analysis
  4. Speech
    1. Speech recognition
    2. Speech segmentation
    3. Text-to-speech

The four areas in brief:

  1. Syntax
    Syntax concerns the arrangement of words in a sentence so that they make grammatical sense.
  2. Semantics
    Semantics concerns the meaning conveyed by a text. It is widely regarded as a difficult part of natural language processing.
  3. Discourse
    Discourse concerns written or spoken communication as a whole, including its summarization and analysis (for example, of debates).
  4. Speech
    Speech concerns spoken language, for example speech recognition, which converts speech into written text.
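
Two of the syntax tasks listed above, sentence breaking and splitting a sentence into word tokens, can be sketched with nothing but Python's standard library. This is a deliberately naive illustration, not how any of the libraries below implement it: the function names `split_sentences` and `tokenize` are made up for this example, and the regular expressions assume simple English text in which every sentence ends with `.`, `!` or `?` (abbreviations such as "Dr." would be split incorrectly).

```python
import re


def split_sentences(text):
    """Naive sentence breaking: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Naive tokenization: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


text = "NLP is fun. Computers read text! Do they understand it?"
sentences = split_sentences(text)
print(sentences)               # the three sentences as separate strings
print(tokenize(sentences[0]))  # ['NLP', 'is', 'fun', '.']
```

Real text needs far more care than this, which is one reason to use a dedicated library.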

Various NLP Libraries Written in Python

| Library | Details |
| --- | --- |
| spaCy | Highly optimized NLP library designed to work together with deep learning frameworks such as TensorFlow or PyTorch. It comes with pre-trained statistical models and word vectors. |
| Gensim | Python library for topic modelling, document indexing and similarity retrieval with large corpora. |
| Pattern | Web (data) mining and crawling, plus common NLP tasks. |
| NLTK | A leading platform for building Python programs that work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, etc. |
| TextBlob | Python (2 and 3) library for processing textual data. It provides a simple API for common NLP tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, WordNet integration, parsing, word inflection, and more. New models or languages can be added through extensions. |
| Polyglot | Natural language pipeline that supports massively multilingual applications. Features include tokenisation, language detection, named entity recognition, part-of-speech tagging, sentiment analysis, word embeddings, etc. |
| Vocabulary | Python library that is essentially a dictionary in the form of a Python module. For a given word it can return the meaning, synonyms, antonyms, part of speech and translations. |
| PyNLPl | Extensive functionality for FoLiA XML and many other common NLP formats (CQL, Giza, Moses, ARPA, Timbl, etc.). It contains modules for common, and less common, NLP tasks, and can be used for basic tasks such as extracting n-grams and frequency lists and building a simple language model. |
| Stanford CoreNLP Python | Reliable, robust and accurate NLP platform based on a client-server architecture. Written in Java and accessible through multiple Python wrapper libraries. |
| Quepy | Python framework that transforms natural language questions into queries in a database query language. |
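
Several of these libraries offer n-gram extraction and frequency lists as basic building blocks. To show what those terms mean, here is a minimal sketch in plain Python using only the standard library. It does not use the API of PyNLPl or any other library above; the helper `ngrams` and the sample sentence are invented for this illustration, and the text is split on whitespace for simplicity.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "the cat sat on the mat and the cat slept".split()

# Frequency list of single words (unigrams).
unigram_counts = Counter(tokens)
# Frequency list of word pairs (bigrams).
bigram_counts = Counter(ngrams(tokens, 2))

print(unigram_counts.most_common(2))  # [('the', 3), ('cat', 2)]
print(bigram_counts.most_common(1))   # [(('the', 'cat'), 2)]
```

Counts like these are the raw material for a simple language model, which estimates how likely a word is given the words before it.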

What is NLTK?

The Natural Language Toolkit (NLTK) is a platform, written in Python, for building text analysis programs.

We will look at NLTK in detail in the next article, Tokenization of Words and Sentences using NLTK:

http://mycloudplace.com/tokenize-of-words-and-sentences-using-nltk/

