Here's a simple TensorFlow (Keras) example for sentiment analysis using a CSV file.

This example assumes the CSV file has two columns:
text: the review or sentence.
label: 0 for negative sentiment, 1 for positive sentiment.

Step 1:- sentiment.csv file contents:-

    text,label
    "I love this movie!",1
    "This is terrible.",0
    "Amazing experience",1
    "Awful and boring",0
    "Absolutely fantastic!",1
    "I hated every minute of it.",0
    "A masterpiece of cinema.",1
    "Not worth the time.",0
    "Brilliant and inspiring.",1
    "Poorly written and acted.",0
    "Heartwarming and beautiful.",1
    "Completely disappointing.",0
    "A joy to watch!",1
    "I'll never watch this again.",0
    "Exceeded my expectations.",1
    "The plot made no sense.",0
    "Touching and emotional.",1
    "Full of clichés and bad jokes.",0
    "Simply outstanding!",1
    "Terrible from start to finish.",0
    "Well-acted and directed.",1
    "The acting was painful.",0
    "One of the best I've seen.",1
    "Boring and predictable.",0
    "Highly recommended!",1
    "A total waste of time.",0
    "Funny and entertaining.",1
    "I couldn’t finish it.",0
    "Loved the characters.",1
    "Nothing good about it.",0

Step 2:- sentiment.py file code:-

    import pandas as pd
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.preprocessing.text import tokenizer_from_json

    # 1. Load CSV Data
    df = pd.read_csv("sentiment.csv")  # Make sure the file is in the same directory
    texts = df['text'].astype(str).tolist()
    labels = df['label'].tolist()

    # 2. Split into Train/Test
    X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.2, random_state=42)

    # 3. Tokenize Text
    vocab_size = 1000
    tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
    tokenizer.fit_on_texts(X_train)

    # Convert text to sequences
    X_train_seq = tokenizer.texts_to_sequences(X_train)
    X_test_seq = tokenizer.texts_to_sequences(X_test)

    # 4. Pad Sequences
    max_length = 100
    X_train_pad = pad_sequences(X_train_seq, maxlen=max_length, padding='post')
    X_test_pad = pad_sequences(X_test_seq, maxlen=max_length, padding='post')

    # Convert labels to numpy arrays to avoid ValueError
    y_train = np.array(y_train)
    y_test = np.array(y_test)

    # 5. Build the Model
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 16),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    # 6. Train the Model
    model.fit(X_train_pad, y_train, epochs=10, validation_data=(X_test_pad, y_test))

    # Save the model with proper extension
    model.save("sentiment_model.keras")

    # Save tokenizer
    token_json = tokenizer.to_json()
    with open('tokenizer.json', 'w') as f:
        f.write(token_json)

    print("Model and tokenizer saved.")

    # 7. Evaluate
    loss, accuracy = model.evaluate(X_test_pad, y_test)
    print(f"Test Accuracy: {accuracy:.2f}")

    # 8. Load Model (use correct filename)
    loaded_model = tf.keras.models.load_model("sentiment_model.keras")

    # 9. Load Tokenizer
    with open('tokenizer.json') as f:
        token_data = f.read()
    loaded_tokenizer = tokenizer_from_json(token_data)

    # 10. Predict on new samples
    texts_to_predict = [
        "I really enjoyed this movie!",
        "It was a waste of time."
    ]

    # Tokenize and pad
    sequences = loaded_tokenizer.texts_to_sequences(texts_to_predict)
    padded = pad_sequences(sequences, maxlen=max_length, padding='post')

    # Predict
    predictions = loaded_model.predict(padded)

    for text, pred in zip(texts_to_predict, predictions):
        sentiment = "Positive" if pred > 0.5 else "Negative"
        print(f"Text: {text}\nSentiment: {sentiment} (Confidence: {pred[0]:.2f})\n")

Run it:-

    python sentiment.py

Output:-

    Text: I really enjoyed this movie!
    Sentiment: Positive (Confidence: 0.56)

    Text: It was a waste of time.
    Sentiment: Positive (Confidence: 0.55)

Both scores are close to 0.5, and the second sample is labeled Positive even though it is clearly negative. With a dataset of only 30 rows the model learns very little, and the exact numbers will likely differ from run to run.

Detailed explanation:-
The walkthrough below explains each line of the code.

Imports:

    import pandas as pd
Imports the pandas library as pd, which is used for data manipulation and analysis, especially for tabular data such as CSV files.

    import numpy as np
Imports NumPy as np, a library for numerical operations, arrays, and mathematical functions.

    import tensorflow as tf
Imports TensorFlow as tf, a deep learning framework used to build and train machine learning models.

    from tensorflow.keras.preprocessing.text import Tokenizer
Imports Tokenizer, a Keras utility (part of TensorFlow) that converts text into sequences of integers.

    from tensorflow.keras.preprocessing.sequence import pad_sequences
Imports pad_sequences, a function that pads sequences to the same length, which neural networks need for batch processing.

    from sklearn.model_selection import train_test_split
Imports train_test_split from scikit-learn to split the dataset into training and testing sets.

    from tensorflow.keras.preprocessing.text import tokenizer_from_json
Imports tokenizer_from_json, which rebuilds a tokenizer from a saved JSON string.

Data Loading and Preparation:

    df = pd.read_csv("sentiment.csv")  # Make sure the file is in the same directory
Reads the CSV file named "sentiment.csv" into a pandas DataFrame df.

    texts = df['text'].astype(str).tolist()
Extracts the "text" column from the DataFrame, converts every entry to a string (in case some are missing or of another type), and turns the column into a Python list.

    labels = df['label'].tolist()
Extracts the "label" column, which holds the sentiment labels (0 or 1), and converts it into a list.

Train-Test Split:

    X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.2, random_state=42)
Splits the data into training and testing sets:
80% training data (24 of the 30 rows)
20% test data (6 of the 30 rows)
random_state=42 makes the split reproducible.

Tokenization:

    vocab_size = 1000
Sets the vocabulary limit. The tokenizer keeps only the most frequent words; Keras retains word indices below num_words, so slightly fewer than 1000 distinct words survive.
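To make the vocabulary limit and the out-of-vocabulary token more concrete, here is a minimal pure-Python sketch of the idea. It is a simplified imitation, not the Keras implementation: the helper names build_word_index and to_sequence, the regex-based word splitting, and the three training sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

OOV = "<OOV>"  # assumed placeholder for unknown words


def split_words(text):
    """Lowercase the text and pull out simple word tokens (illustrative only)."""
    return re.findall(r"[a-z']+", text.lower())


def build_word_index(texts, oov_token=OOV):
    """Rank words by frequency; 0 is left for padding, 1 goes to the OOV token."""
    counts = Counter(word for text in texts for word in split_words(text))
    index = {oov_token: 1}
    # most_common() sorts by count; ties keep first-seen order
    for i, (word, _) in enumerate(counts.most_common(), start=2):
        index[word] = i
    return index


def to_sequence(text, word_index, num_words, oov_token=OOV):
    """Map words to indices; unseen or over-the-limit words get the OOV index."""
    oov_index = word_index[oov_token]
    sequence = []
    for word in split_words(text):
        i = word_index.get(word)
        sequence.append(i if i is not None and i < num_words else oov_index)
    return sequence


train_texts = ["I love this movie", "This is terrible", "I love it"]
word_index = build_word_index(train_texts)

# "film" never appeared in training, so it maps to the OOV index 1.
print(to_sequence("I love this film", word_index, num_words=1000))  # [2, 3, 4, 1]

# With a tiny limit, "this" (index 4) is no longer below num_words.
print(to_sequence("I love this film", word_index, num_words=4))     # [2, 3, 1, 1]
```

The second call shows why the limit matters: any word whose index is not below num_words is treated exactly like a word the tokenizer has never seen.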
    tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
Creates a tokenizer that keeps only the most frequent words (up to the vocab_size limit) and uses the <OOV> token for out-of-vocabulary words, that is, words outside that limit or never seen during fitting.

    tokenizer.fit_on_texts(X_train)
Builds the word index from the training texts only.

    X_train_seq = tokenizer.texts_to_sequences(X_train)
    X_test_seq = tokenizer.texts_to_sequences(X_test)
Converts the training and testing texts into sequences of integers, one integer per word.

Padding:

    max_length = 100
Sets a maximum sequence length of 100 words.

    X_train_pad = pad_sequences(X_train_seq, maxlen=max_length, padding='post')
    X_test_pad = pad_sequences(X_test_seq, maxlen=max_length, padding='post')
Pads sequences shorter than 100 with zeros at the end (post padding) and trims sequences longer than 100 words. Because truncating is left at its default of 'pre', trimming removes words from the start of the sequence.

Label Preparation:

    y_train = np.array(y_train)
    y_test = np.array(y_test)
Converts the label lists into NumPy arrays for TensorFlow compatibility.

Model Building:

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 16),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
Defines a simple neural network model:
Embedding layer: maps each word index to a 16-dimensional vector.
GlobalAveragePooling1D: averages the embeddings over the sequence length.
Dense(16): fully connected layer with 16 neurons and ReLU activation.
Dense(1): output layer with sigmoid activation (for binary classification).

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Compiles the model with:
Binary crossentropy loss (since this is binary classification)
Adam optimizer
Accuracy as the metric to track

Model Training:

    model.fit(X_train_pad, y_train, epochs=10, validation_data=(X_test_pad, y_test))
Trains the model for 10 epochs on the training data, validating on the test data after each epoch.

Saving Model and Tokenizer:

    model.save("sentiment_model.keras")
Saves the trained model in the .keras file format.
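As a side note on the padding and pooling steps described earlier, the following plain-Python sketch shows how post-padding interacts with averaging. It is an illustration under stated assumptions, not the Keras code: pad_post and average_pool are hypothetical helpers, and the 2-dimensional embedding table (with an all-zero row for the padding index) is made up. In Keras, an Embedding layer created without mask_zero=True should produce no mask, so the pooling layer averages over padded positions as well.

```python
def pad_post(seq, maxlen, value=0):
    """Pad with `value` at the end; if too long, keep the last `maxlen` items."""
    trimmed = seq[-maxlen:]
    return trimmed + [value] * (maxlen - len(trimmed))


def average_pool(padded, embeddings):
    """Average the embedding vectors over every position, padding included."""
    dim = len(embeddings[0])
    return [sum(embeddings[i][d] for i in padded) / len(padded) for d in range(dim)]


# Hypothetical 2-dimensional embeddings; row 0 is the padding index.
embeddings = [
    [0.0, 0.0],   # 0: padding
    [0.1, 0.1],   # 1: OOV
    [1.0, -1.0],  # 2
    [3.0, 1.0],   # 3
]

short = pad_post([2, 3], maxlen=4)
longer = pad_post([2, 3], maxlen=8)

print(short, average_pool(short, embeddings))    # [2, 3, 0, 0] [1.0, 0.0]
print(longer, average_pool(longer, embeddings))  # [2, 3, 0, 0, 0, 0, 0, 0] [0.5, 0.0]
```

The same two words give a weaker pooled vector when more padding is added. With max_length = 100 and reviews of only a few words, this dilution is one plausible reason the scores in the output stay close to 0.5; a shorter max_length or mask_zero=True on the Embedding layer are options worth trying.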
    token_json = tokenizer.to_json()
    with open('tokenizer.json', 'w') as f:
        f.write(token_json)
Saves the tokenizer as a JSON file so it can be reused later.

    print("Model and tokenizer saved.")
Confirms that saving is complete.

Evaluation:

    loss, accuracy = model.evaluate(X_test_pad, y_test)
    print(f"Test Accuracy: {accuracy:.2f}")
Evaluates the model on the test data and prints the test accuracy.

Loading Model and Tokenizer:

    loaded_model = tf.keras.models.load_model("sentiment_model.keras")
Loads the saved model from disk.

    with open('tokenizer.json') as f:
        token_data = f.read()
    loaded_tokenizer = tokenizer_from_json(token_data)
Loads the tokenizer from the saved JSON file.

Prediction on New Samples:

    texts_to_predict = [
        "I really enjoyed this movie!",
        "It was a waste of time."
    ]
Defines new sample texts to classify.

    sequences = loaded_tokenizer.texts_to_sequences(texts_to_predict)
    padded = pad_sequences(sequences, maxlen=max_length, padding='post')
Tokenizes and pads the new texts in the same way as the training data.

    predictions = loaded_model.predict(padded)
Predicts a sentiment score between 0 and 1 for each new sample.

    for text, pred in zip(texts_to_predict, predictions):
        sentiment = "Positive" if pred > 0.5 else "Negative"
        print(f"Text: {text}\nSentiment: {sentiment} (Confidence: {pred[0]:.2f})\n")
Prints each input text with its predicted sentiment ("Positive" if the score is above 0.5, otherwise "Negative") and the score itself. The value printed as "Confidence" is the raw sigmoid output, which is the estimated probability of the positive class, so for a "Negative" prediction a lower number means a more confident result.
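Since the printed score is the probability of the positive class, it can help to report confidence in the predicted label instead. The sketch below shows one way to do that; interpret_score is a hypothetical helper that is not part of the script above, and the first two scores are the ones from the sample output.

```python
def interpret_score(p_positive, threshold=0.5):
    """Return (label, confidence in that label) for a sigmoid output."""
    if p_positive > threshold:
        return "Positive", p_positive
    # For a negative prediction, confidence is the complement of the score.
    return "Negative", 1.0 - p_positive


for score in (0.56, 0.55, 0.10):
    label, confidence = interpret_score(score)
    print(f"score={score:.2f} -> {label} (confidence in label: {confidence:.2f})")
```

In the prediction loop, this could be called as interpret_score(float(pred[0])), assuming predictions has one sigmoid output per sample as in the script.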