Active questions tagged nltk

How to download punkt tokenizer in nltk?

December 17, 2024, 12:19 pm

I installed the NLTK library using `pip install nltk`, and while using the library: `from nltk.tokenize import sent_tokenize` `sent_tokenize(text)` I am getting this error: LookupError:...

View Article

Unable to use nltk functions

December 20, 2024, 10:17 pm

I was trying to run some nltk functions on the UCI spam message dataset but ran into the problem of word_tokenize not working even after downloading dependencies. `import nltk` `nltk.download('punkt')` from...

View Article

nltk add or remove some abbreviations for the specific project not working

January 30, 2025, 6:19 am

When tokenizing paragraphs in the Czech language, I am observing that some abbreviations are not treated as abbreviations. The paragraph is stored in the file as one long line. The nltk is of the...

View Article

PunktTokenizer does not work with Russian `я.`

February 3, 2025, 4:48 am

When tokenizing paragraphs to sentences in the Russian language, I am observing the special case when the sequence is not treated as the end of the sentence. The case is with the я. at the end of the...

View Article

Python List of Ngrams with frequencies

February 10, 2025, 10:21 pm

I need to get the most popular n-grams from a text. N-gram length must be from 1 to 5 words. I know how to get bigrams and trigrams. For example: `bigram_measures = nltk.collocations.BigramAssocMeasures()` finder...
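One way to approach the question above, sketched in plain Python rather than with the collocation finders the asker mentions: count every n-gram of length 1 to 5 with a `Counter`. The function name `ngram_frequencies` and the sample sentence are illustrative assumptions, not part of the original question.

```python
from collections import Counter

def ngram_frequencies(tokens, min_n=1, max_n=5):
    """Count every n-gram of length min_n..max_n in a list of tokens."""
    counts = Counter()
    for n in range(min_n, max_n + 1):
        # Zipping n staggered slices of the token list yields each n-gram as a tuple.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts

# Hypothetical sample text, tokenized by whitespace for simplicity.
tokens = "the cat sat on the mat and the cat slept".split()
freqs = ngram_frequencies(tokens)
print(freqs.most_common(3))
```

With NLTK available, `nltk.util.ngrams(tokens, n)` could replace the `zip` expression; the counting logic stays the same.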

View Article

Get all possible part-of-speech tags for a word Python

February 12, 2025, 11:19 pm

Is there any way to make this code work with a column on a data frame that contains 1 word only? I just need all POS that a single word can have. Enclosed is an example of pack which can be a NN or...

View Article

How do I remove escape characters from output of nltk.word_tokenize?

February 18, 2025, 12:10 pm

How do I get rid of non-printing (escaped) characters from the output of the nltk.word_tokenize method? I am working through the book 'Natural Language Processing with Python' and am following the code...

View Article

Download data models while installing my python library

February 19, 2025, 11:31 pm

Sometimes, a Python library depends on additional data, such as ML models. This could be a model from transformers, spacy, nltk and so on. Typically there is a command to download such a model: python -m...

View Article

How do I write this into a function in Python 3?

February 26, 2025, 12:34 pm

How would I write this into a function that gives the same output? `from nltk.book import text2` `sorted([word.lower() for word in text2 if len(word)>4 and len(word)<12])`
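A minimal sketch of such a function, assuming the goal is simply to parameterize the word source and the length bounds. The name `sorted_mid_length_words` and the short sample list (a stand-in for `text2`, which needs the NLTK book data) are assumptions made for illustration.

```python
def sorted_mid_length_words(words, min_len=5, max_len=11):
    """Return lowercased words with min_len <= length <= max_len, sorted.

    The defaults mirror the original condition len(word) > 4 and len(word) < 12.
    """
    return sorted(w.lower() for w in words if min_len <= len(w) <= max_len)

# Hypothetical stand-in for text2; any iterable of strings works.
sample = ["Sense", "and", "Sensibility", "by", "Jane", "Austen"]
print(sorted_mid_length_words(sample))  # ['austen', 'sense', 'sensibility']
```

Calling `sorted_mid_length_words(text2)` after `from nltk.book import text2` should give the same list as the original one-liner.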

View Article

Fixing Missing NLTK Tokenizer Resources

February 27, 2025, 1:21 pm

Repeated LookupError even though NLTK is downloaded: Resource punkt_tab not found. Please use the NLTK Downloader to obtain the resource: `>>> import nltk` `nltk.download('punkt_tab')`...

View Article

Count of Combination of bigrams

February 28, 2025, 2:50 am

I have created a dataset as follows using bigrams (columns: index, product_action): ('customer', 'called') action; ('customer', 'service') action; ('blue', 'dress') product; ('the', 'service') product; ('to',...

View Article

nltk.NaiveBayesClassifier.classify() input parameter

March 4, 2025, 11:53 pm

I have the following trained classifier: `classifier = nltk.NaiveBayesClassifier.train(features[:train_count])` When I try to use it to classify(): `result = classifier.classify(feature)` and feature is...

View Article

Compare two phrases using WordNet? [closed]

March 7, 2025, 5:35 am

I am trying to compare the semantics of two phrases. In Python I am using nltk and difflib. First I am removing the stop words from the phrases, then I am using WordNetLemmatizer and PorterStemmer to...

View Article

How come I can't import nltk even it's already installed successfully?

March 9, 2025, 10:08 pm

Hi, I tried to install nltk from the VS Code terminal, which reported success, but I am still not able to import nltk in Python. It said "no module named 'nltk'". I attached my screenshot for clearer...

View Article

How to set Python path for NLTK in Palantir Foundry Python Transform in Code...

March 12, 2025, 8:26 am

I am attempting to create a Python transform that requires me to import nltk. When I import nltk, later on I get: Resource punkt_tab not found. Please use the NLTK Downloader to obtain the...

View Article

Why nltk word_tokenize is not working even after doing a nltk.download and...

March 13, 2025, 2:07 pm

I am using Python 3.7 64-bit, nltk version 3.4.5. When I try to convert text6 in nltk.book to tokens using word_tokenize, I get an error. `import nltk` `from nltk.tokenize import word_tokenize` from...

View Article

Sentiment Analysis, Naive Bayes Accuracy

March 24, 2025, 5:43 am

I'm trying to form a Naive Bayes Classifier script for sentiment classification of tweets. I'm pasting my whole code here, because I know I will get hell if I don't. So basically I use NLTK's...

View Article

Naive bayes Classification in Python

March 24, 2025, 5:44 am

I have read all data from a csv file using: `import csv` `import nltk` `f = open('C:/Users/Documents/Data/exp.csv')` `csv_f = csv.reader(f)` `dataset = []` `for row in csv_f: dataset.append(row)` `print (dataset)` Now, I want...

View Article

re.sub erroring with "Expected string or bytes-like object"

March 27, 2025, 1:04 am

I have read multiple posts regarding this error, but I still can't figure it out. When I try to loop through my function: `def fix_Plan(location): letters_only = re.sub("[^a-zA-Z]", # Search for all...`
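This error typically means `re.sub` received something that is not a string, for example `None` or a float `NaN` from a missing value. The excerpt is truncated, so the sketch below rests on assumptions: that the replacement is a single space, and that missing values should become empty text. The helper name `letters_only` is borrowed from the variable in the excerpt; the rest is hypothetical.

```python
import math
import re

def letters_only(value):
    """Replace every non-letter character with a space, tolerating non-string input."""
    # Assumption: missing values (None or float NaN) are mapped to an empty string.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return ""
    # Coerce anything else (ints, other objects) to text before applying the regex.
    return re.sub("[^a-zA-Z]", " ", str(value))

print(repr(letters_only("a1b")))          # 'a b'
print(repr(letters_only(float("nan"))))   # ''
```

Whether silently coercing is appropriate depends on the data; raising a clearer error for unexpected types is an equally reasonable design.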

View Article

Task to convert natural language query to SQL query

March 30, 2025, 12:17 am

I have a task where I have to convert a natural language query such as "what is the number of soap in inventory?" to select count(item) from inventory where item="Soap" group by item. I am trying to...

View Article

tokenize sentence into words python

April 14, 2025, 4:37 am

I want to extract information from different sentences, so I'm using nltk to divide each sentence into words. I'm using this code: `words=[]` `for i in range(len(sentences)):`...

View Article

How to remove stop words using nltk or python

April 29, 2025, 10:57 am

I have a dataset from which I would like to remove stop words. I used NLTK to get a list of stop words: `from nltk.corpus import stopwords` `stopwords.words('english')` Exactly how do I compare the data to the...
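The comparison the asker describes is usually a membership test against a set. A rough sketch follows; the small `STOP_WORDS` set is a hypothetical stand-in for `set(stopwords.words('english'))` so that the example runs without the NLTK corpus, and the function name is likewise an assumption.

```python
# Hypothetical stand-in for set(stopwords.words('english')).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens whose lowercase form is not a stop word."""
    return [t for t in tokens if t.lower() not in stop_words]

print(remove_stop_words("The quick brown fox is in the garden".split()))
# ['quick', 'brown', 'fox', 'garden']
```

Converting the stop word list to a set first keeps each lookup constant-time, which matters on larger datasets.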

View Article

Python - Sentiment Analysis using Pointwise Mutual Information

May 1, 2025, 10:50 am

`from __future__ import division` `import urllib` `import json` `from math import log` `def hits(word1,word2=""):` `query = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=%s"` `if word2 == "":` results...

View Article

Text analysis: finding the most common word in a column

May 13, 2025, 3:27 pm

I have created a dataframe with just one column, the subject line. `df = activities.filter(['Subject'],axis=1)` `df.shape` This returned this dataframe: Subject; 0: Call Out: Quadria Capital - May Lo, VP; 1: Call...
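A possible approach, sketched without pandas: split each subject line into lowercase words and tally them with a `Counter`. The function name and the sample subject lines are invented for illustration; with a real dataframe, passing `df['Subject']` as the iterable should work the same way, since iterating a Series yields its values.

```python
import re
from collections import Counter

def most_common_words(subjects, top=3):
    """Return the `top` most frequent lowercase words across all subject lines."""
    counts = Counter()
    for subject in subjects:
        # str() guards against non-string cells such as missing values.
        counts.update(re.findall(r"[a-z']+", str(subject).lower()))
    return counts.most_common(top)

# Hypothetical subject lines standing in for the 'Subject' column.
subjects = ["Call Out: budget review", "Call In: budget sign-off", "Email: review notes"]
print(most_common_words(subjects))
```

Filtering stop words before counting would keep words like "in" or "out" from dominating the result.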

View Article

Python text tokenize code to output results from horizontal to vertical with...

May 25, 2025, 9:28 pm

The code below tokenises the text and identifies the grammar of each tokenised word. `import nltk` `from nltk.tokenize import sent_tokenize, word_tokenize` `from nltk.corpus import wordnet as`...

View Article

name 'nltk' is not defined

May 27, 2025, 10:01 am

The nltk module is running with other libraries in the corpus folder. My Code. I've already tried putting 'import nltk' first but it is still the same, and I've also tried 'from nltk.tokenize import...

View Article

comparing synonyms NLTK [duplicate]

May 29, 2025, 8:24 am

I can't come up with a stranger problem, guess you'll help me. `for p in wn.synsets('change'):`...

View Article

What are `lexpr` and `ApplicationExpression` nltk?

May 29, 2025, 8:30 am

What exactly does lexpr mean, and what does the following r'\F x.x' mean? Also, what is ApplicationExpression? `from nltk.sem.logic import *` `lexpr = Expression.fromstring` `zero = lexpr(r'\F x.x')` one =...

View Article

Why am I getting a LookupError: Resource punkt_tab not found in NLTK even...

May 30, 2025, 3:13 am

I’m trying to perform Named Entity Recognition (NER) using NLTK, SpaCy, and a dataset in PyCharm. However, I’m encountering an error related to a missing resource (punkt_tab) when tokenizing text....

View Article

Removing nonsense words in python

June 4, 2025, 4:47 pm

I want to remove nonsense words in my dataset. I tried something I saw on StackOverflow, like this: `import nltk` `words = set(nltk.corpus.words.words())` `sent = "Io andiamo to the beach with my`...

View Article