KeyNeg in Action

A step-by-step tutorial on extracting negative sentiment from real-world employee feedback data

Glassdoor Employee Reviews Analysis

Learn how to narrow 838,566 employee reviews down to negative tech-company reviews, then analyze a 200-review sample to extract actionable insights about workplace sentiment

1

The Dataset

We're using the Glassdoor Job Reviews dataset from Kaggle, containing 838,566 employee reviews from various companies. Here's a sample of the data:

Sample Data (glassdoor_reviews.csv)

firm | overall_rating | pros | cons
IBM | 1 | Good benefits package | Management is completely out of touch with reality. Constant layoffs create fear...
Oracle | 2 | Good salary | No career growth opportunities. Managers play favorites and the work environment is toxic...
Microsoft | 2 | Great perks | Work-life balance is terrible. Constant reorgs make it impossible to focus...
Google | 2 | Amazing campus | Too much bureaucracy. Hard to get promoted without politics...
Apple | 1 | Prestigious brand | Extremely long hours expected. No room for creativity, just follow orders...

838,566 Total Reviews
155,844 Tech Company Reviews
20,587 Negative Reviews (1-2 stars)
200 Sample Analyzed
2

Load and Prepare the Data

First, we load the CSV file and filter for negative reviews (1-2 star ratings) from tech companies:

Python prepare_data.py
import pandas as pd

# Load the Glassdoor reviews dataset
df = pd.read_csv('glassdoor_reviews.csv', encoding='latin-1')

# Define tech companies to analyze
tech_companies = ['ibm', 'oracle', 'microsoft', 'google', 'apple']

# Filter for tech companies (case-insensitive)
tech_df = df[df['firm'].str.lower().isin(tech_companies)]

# Get negative reviews (1-2 star ratings)
negative_reviews = tech_df[tech_df['overall_rating'] <= 2]

# Extract the "cons" column - this is what we'll analyze
cons_text = negative_reviews[['firm', 'cons']].dropna()

# The ":," format spec adds the thousands separator shown in the output
print(f"Found {len(cons_text):,} negative reviews to analyze")
Output
Found 20,587 negative reviews to analyze
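
If the full CSV is too large to load comfortably, the same filter can be applied row by row. The following is a minimal sketch using only the standard library, assuming the column names shown in the sample data (firm, overall_rating, cons). The iter_negative_cons helper and the inline rows are illustrative stand-ins, not part of KeyNeg or the dataset.

```python
import csv
import io

TECH_COMPANIES = {"ibm", "oracle", "microsoft", "google", "apple"}

def iter_negative_cons(lines, companies=TECH_COMPANIES, max_rating=2):
    """Yield (firm, cons) for rows from the given companies rated <= max_rating."""
    for row in csv.DictReader(lines):
        firm = (row.get("firm") or "").strip()
        cons = (row.get("cons") or "").strip()
        try:
            rating = float(row.get("overall_rating") or "")
        except ValueError:
            continue  # skip rows with a missing or malformed rating
        # Keep only tech-company rows with a low rating and non-empty "cons" text
        if firm.lower() in companies and rating <= max_rating and cons:
            yield firm, cons

# Tiny in-memory stand-in for glassdoor_reviews.csv (hypothetical rows)
sample_csv = io.StringIO(
    "firm,overall_rating,pros,cons\n"
    "IBM,1,Good benefits package,Management is out of touch\n"
    "Google,5,Amazing campus,Nothing major\n"
    "Acme,1,Free coffee,Everything else\n"
    "Oracle,2,Good salary,\n"
)

negative = list(iter_negative_cons(sample_csv))
print(negative)
```

For the real file, the same generator could be fed an open file handle, for example open('glassdoor_reviews.csv', encoding='latin-1', newline=''), so that only the matching rows are held in memory.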
3

Initialize KeyNeg

Import KeyNeg and initialize the analyzer. For enterprise environments, the model is bundled and works offline:

Python initialize.py
# For open source version
from keyneg import KeyNeg

# Or for enterprise version (air-gapped, no internet required)
# from keyneg_enterprise import KeyNeg

# Initialize the analyzer
kn = KeyNeg()

# KeyNeg comes with 95+ built-in sentiment labels including:
# - incompetent management
# - no growth opportunities
# - hostile work environment
# - layoffs
# - work life imbalance
# - and many more...

print("KeyNeg initialized successfully")
Output
KeyNeg initialized successfully
4

Analyze a Single Review

Let's start by analyzing a single employee review to understand the output format:

Python single_analysis.py
# Sample review from IBM employee
review = """Management is completely out of touch with reality.
Constant layoffs create fear and uncertainty. No clear career path
and promotions are based on politics, not merit."""

# Analyze the review
result = kn.analyze(review)

# View the results
print("Top Sentiment:", result['top_sentiment'])
print("Negativity Score:", result['negativity_score'])
print("All Sentiments:", result['sentiments'])
Output
Top Sentiment: incompetent management
Negativity Score: 0.45
All Sentiments: ['incompetent management', 'layoffs', 'no growth opportunities']
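
Downstream code can be easier to read when the result dictionary is wrapped in a small typed record. The sketch below assumes only the three keys visible in the output above (top_sentiment, negativity_score, sentiments). The ReviewResult class is a hypothetical helper, and the dictionary is a hand-copied stand-in for a real kn.analyze result.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ReviewResult:
    """Hypothetical wrapper around one analysis result dictionary."""
    top_sentiment: str
    negativity_score: float
    sentiments: List[str]

    @classmethod
    def from_dict(cls, raw: dict) -> "ReviewResult":
        # A KeyError here signals that the result shape differs from the assumed one
        return cls(
            top_sentiment=raw["top_sentiment"],
            negativity_score=float(raw["negativity_score"]),
            sentiments=list(raw["sentiments"]),
        )

    def summary(self) -> str:
        others = [s for s in self.sentiments if s != self.top_sentiment]
        extra = ", ".join(others) or "none"
        return f"{self.top_sentiment} ({self.negativity_score:.2f}); also: {extra}"

# Stand-in copied from the tutorial output, not produced by running KeyNeg here
raw = {
    "top_sentiment": "incompetent management",
    "negativity_score": 0.45,
    "sentiments": ["incompetent management", "layoffs", "no growth opportunities"],
}

parsed = ReviewResult.from_dict(raw)
print(parsed.summary())
# prints: incompetent management (0.45); also: layoffs, no growth opportunities
```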
5

Batch Analysis

Now let's draw a random sample of 200 reviews and analyze them in a single call, using batch processing for efficiency:

Python batch_analysis.py
from collections import Counter

# Draw a reproducible random sample of 200 reviews
sample_reviews = cons_text.sample(n=200, random_state=42)

# Convert to list for batch processing
review_texts = sample_reviews['cons'].tolist()

# Analyze all reviews in batch
results = kn.analyze_batch(review_texts)

# Count sentiment occurrences across all results
all_sentiments = []
for r in results:
    all_sentiments.extend(r['sentiments'])

sentiment_counts = Counter(all_sentiments)

# Display top 10 sentiments
print("Top 10 Negative Sentiments:")
for sentiment, count in sentiment_counts.most_common(10):
    print(f"  {sentiment}: {count}")
Output
Top 10 Negative Sentiments:
  incompetent management: 72
  no growth opportunities: 14
  career stagnation: 12
  organizational instability: 10
  layoffs: 9
  hostile work environment: 8
  poor customer service: 6
  lack of collaboration: 4
  dismissive management: 4
  poor leadership: 4
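
Raw counts are often easier to compare as a share of reviews. This is a minimal sketch, assuming each result carries a sentiments list as in the single-review output. The mock results are invented for illustration, and label_shares is a hypothetical helper. Labels are de-duplicated per review so that a share can never exceed 100%.

```python
from collections import Counter

def label_shares(results):
    """Return {label: fraction of reviews mentioning it}, most common first."""
    total = len(results)
    if total == 0:
        return {}
    counts = Counter()
    for r in results:
        counts.update(set(r["sentiments"]))  # count each label once per review
    return {label: n / total for label, n in counts.most_common()}

# Mock results (hypothetical), shaped like the tutorial's result dictionaries
mock_results = [
    {"sentiments": ["incompetent management", "layoffs"]},
    {"sentiments": ["incompetent management"]},
    {"sentiments": ["no growth opportunities", "no growth opportunities"]},
    {"sentiments": []},
]

shares = label_shares(mock_results)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

Applied to the counts above, 72 occurrences across 200 reviews corresponds to 36%, provided no review carries the same label twice.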
6

Group Results by Company

Let's break down the analysis by company to compare sentiment patterns:

Python company_analysis.py
# Reset the index so the sampled rows are numbered from 0
sample_reviews = sample_reviews.reset_index(drop=True)

# Analyze by company
company_results = {}

for company in tech_companies:
    # Get reviews for this company
    mask = sample_reviews['firm'].str.lower() == company
    company_reviews = sample_reviews[mask]['cons'].tolist()

    if len(company_reviews) > 0:
        # Analyze reviews
        company_analysis = kn.analyze_batch(company_reviews)

        # Calculate average negativity
        avg_neg = sum(r['negativity_score'] for r in company_analysis) / len(company_analysis)

        # Count sentiments
        sentiments = []
        for r in company_analysis:
            sentiments.extend(r['sentiments'])

        company_results[company] = {
            'count': len(company_reviews),
            'avg_negativity': round(avg_neg, 2),
            'top_sentiments': Counter(sentiments).most_common(3)
        }

# Display results
for company, data in company_results.items():
    print(f"\n{company.upper()}")
    print(f"  Reviews: {data['count']}")
    print(f"  Avg Negativity: {data['avg_negativity']}")
    print(f"  Top Issues: {data['top_sentiments']}")
Output
IBM
  Reviews: 108
  Avg Negativity: 0.42
  Top Issues: [('incompetent management', 49), ('no growth opportunities', 9), ('layoffs', 7)]

ORACLE
  Reviews: 48
  Avg Negativity: 0.43
  Top Issues: [('incompetent management', 14), ('organizational instability', 5), ('hostile work environment', 3)]

MICROSOFT
  Reviews: 25
  Avg Negativity: 0.42
  Top Issues: [('incompetent management', 6), ('work life imbalance', 3), ('organizational instability', 3)]

GOOGLE
  Reviews: 8
  Avg Negativity: 0.33
  Top Issues: [('no growth opportunities', 2), ('hostile work environment', 1), ('overworked', 1)]

APPLE
  Reviews: 11
  Avg Negativity: 0.38
  Top Issues: [('poor customer service', 2), ('career stagnation', 2), ('incompetent management', 2)]
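
Because review_texts was built from sample_reviews in row order, the batch results from the previous step could also be grouped by company without calling the analyzer a second time. The sketch below assumes that analyze_batch returns one result per input, in input order, which the tutorial does not state. group_by_company is a hypothetical helper, and the firms and results are mock data.

```python
from collections import Counter, defaultdict

def group_by_company(firms, results, top_n=3):
    """Summarize already-computed results per firm (matched by position)."""
    grouped = defaultdict(list)
    for firm, result in zip(firms, results):
        grouped[firm.lower()].append(result)

    summary = {}
    for firm, items in grouped.items():
        labels = Counter(label for r in items for label in r["sentiments"])
        avg = sum(r["negativity_score"] for r in items) / len(items)
        summary[firm] = {
            "count": len(items),
            "avg_negativity": round(avg, 2),
            "top_sentiments": labels.most_common(top_n),
        }
    return summary

# Mock inputs (hypothetical): one firm name per result, in the same order
firms = ["IBM", "Oracle", "IBM"]
results = [
    {"negativity_score": 0.5, "sentiments": ["incompetent management", "layoffs"]},
    {"negativity_score": 0.25, "sentiments": ["hostile work environment"]},
    {"negativity_score": 0.3, "sentiments": ["incompetent management"]},
]

summary = group_by_company(firms, results)
print(summary["ibm"])
```

With the tutorial's variables, the call would look like group_by_company(sample_reviews['firm'], results), again under the ordering assumption stated above.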

Results Visualization

The most frequent sentiment labels in the 200-review sample, by number of occurrences:

Incompetent Management: 72
No Growth Opportunities: 14
Career Stagnation: 12
Organizational Instability: 10
Layoffs: 9
Hostile Work Environment: 8
Poor Customer Service: 6

Company Comparison

Sentiment breakdown by company:

IBM

108 reviews, negativity 0.42
Top issues: Incompetent Management (49), No Growth (9), Layoffs (7)

Oracle

48 reviews, negativity 0.43
Top issues: Incompetent Management (14), Org Instability (5), Hostile Environment (3)

Microsoft

25 reviews, negativity 0.42
Top issues: Incompetent Management (6), Work-Life Imbalance (3), Org Instability (3)

Apple

11 reviews, negativity 0.38
Top issues: Poor Customer Service (2), Career Stagnation (2), Incompetent Mgmt (2)

Google

8 reviews, negativity 0.33
Top issues: No Growth (2), Hostile Environment (1), Overworked (1)
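
Raw counts favor companies with more reviews in the sample, so it can help to divide each count by that company's review count. The arithmetic below uses only the figures reported above. mention_rate is an illustrative helper, and Google is left out because "incompetent management" is not among its reported top three.

```python
# (reviews in sample, "incompetent management" count) per company,
# copied from the per-company output above
reported = {
    "IBM": (108, 49),
    "Oracle": (48, 14),
    "Microsoft": (25, 6),
    "Apple": (11, 2),
}

def mention_rate(reviews, mentions):
    """Label occurrences per review for one company."""
    return mentions / reviews

rates = {company: mention_rate(*figures) for company, figures in reported.items()}

for company, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{company}: {rate:.1%}")
# prints:
# IBM: 45.4%
# Oracle: 29.2%
# Microsoft: 24.0%
# Apple: 18.2%
```

The rates for Apple and Microsoft rest on 11 and 25 reviews respectively, so they are rough indications rather than stable estimates.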

Key Insights

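Management is #1 Issue

"Incompetent management" appeared in 36% of the 200 negative reviews analyzed (72 occurrences), making it the dominant complaint overall. It is the top issue at IBM, Oracle, and Microsoft, ties with two other labels at Apple, and does not appear among Google's top three.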

Career Growth Concerns

"No growth opportunities" (14) and "career stagnation" (12) together account for 26 label occurrences, equivalent to 13% of the 200 sampled reviews. Employees want clearer advancement paths.

Google Scores Best

With the lowest average negativity score (0.33), Google's negative reviews are less severe than IBM's (0.42) and Oracle's (0.43), although the Google figure rests on only 8 reviews in the sample.

Try KeyNeg on Your Data

Analyze employee surveys, customer feedback, or any text data for negative sentiment patterns.

Dataset: Glassdoor Job Reviews by David Gauthier on Kaggle (838K+ reviews)