Deterministic semantic encoding - word similarity, valence detection, and clustering without machine learning
Learn how to encode words into deterministic numeric codes that capture semantic meaning, part of speech, abstractness, and sentiment valence
Oyemi is an offline semantic lexicon that converts words into structured numeric codes. Unlike word embeddings (Word2Vec, GloVe), Oyemi codes are deterministic, interpretable, and dependency-free: the same word always yields the same codes, and every digit carries a documented meaning. Each code follows the format HHHH-LLLLL-P-A-V:
| Component | Meaning | Example |
|---|---|---|
| HHHH | Semantic superclass (category) | 0121 = emotion.fear |
| LLLLL | Local synset ID | 00003 = specific sense |
| P | Part of speech | 1=noun, 2=verb, 3=adj, 4=adv |
| A | Abstractness | 0=concrete, 1=mixed, 2=abstract |
| V | Valence (sentiment) | 0=neutral, 1=positive, 2=negative |
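To make the layout concrete, here is a minimal pure-Python sketch (not part of Oyemi's API) that splits a raw code string into its five fields. The lookup tables simply mirror the component table above, and the example code is the primary sense of "fear" shown later in this guide:

```python
# Minimal sketch: decode a raw HHHH-LLLLL-P-A-V code by hand.
# The mappings mirror the component table above.
POS = {"1": "noun", "2": "verb", "3": "adj", "4": "adv"}
ABSTRACTNESS = {"0": "concrete", "1": "mixed", "2": "abstract"}
VALENCE = {"0": "neutral", "1": "positive", "2": "negative"}

def decode(code: str) -> dict:
    superclass, synset, pos, abstractness, valence = code.split("-")
    return {
        "superclass": superclass,                    # HHHH: semantic category
        "synset": synset,                            # LLLLL: local synset ID
        "pos": POS[pos],                             # P: part of speech
        "abstractness": ABSTRACTNESS[abstractness],  # A: concrete/mixed/abstract
        "valence": VALENCE[valence],                 # V: sentiment
    }

print(decode("0121-00003-1-2-2"))
# {'superclass': '0121', 'synset': '00003', 'pos': 'noun',
#  'abstractness': 'abstract', 'valence': 'negative'}
```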
Install Oyemi from PyPI. The package includes the pre-built lexicon - no additional downloads required:
pip install oyemi
Successfully installed oyemi-3.0.1
Encode words to get their semantic codes. Words with multiple meanings return multiple codes:
from oyemi import Encoder
# Initialize encoder
enc = Encoder()
# Encode a simple word
codes = enc.encode("happy")
print("Codes for 'happy':", codes)
# Polysemous word (multiple meanings)
codes = enc.encode("bank")
print("Codes for 'bank':", codes[:3]) # First 3 senses
# Check lexicon size
print(f"Lexicon contains {enc.word_count:,} words")
Codes for 'happy': ['3010-00001-3-1-1', '3999-05469-3-1-1', '3999-05731-3-1-1']
Codes for 'bank': ['0174-00012-1-0-0', '0045-00089-1-0-0', '2030-00156-2-1-0']
Lexicon contains 145,014 words
Use encode_parsed() to get structured SemanticCode objects with named attributes:
# Get parsed semantic codes
parsed = enc.encode_parsed("fear")
# Examine the primary sense
primary = parsed[0]
print(f"Word: fear")
print(f" Code: {primary.raw}")
print(f" Superclass: {primary.superclass}")
print(f" Part of Speech: {primary.pos_name}")
print(f" Abstractness: {primary.abstractness_name}")
print(f" Valence: {primary.valence_name}")
# Compare positive vs negative words
for word in ["love", "hate", "table"]:
    p = enc.encode_parsed(word)[0]
    print(f"{word:10} -> {p.valence_name}")
Word: fear
 Code: 0121-00003-1-2-2
 Superclass: 0121
 Part of Speech: noun
 Abstractness: abstract
 Valence: negative
love       -> positive
hate       -> negative
table      -> neutral
Oyemi provides built-in sentiment detection without any ML models. Perfect for deterministic text analysis:
# Analyze sentiment of a sentence
sentence = "The manager was incompetent and the layoffs were devastating"
# Tokenize and analyze
words = sentence.lower().split()
valence_counts = {'positive': 0, 'negative': 0, 'neutral': 0}
for word in words:
    try:
        parsed = enc.encode_parsed(word, raise_on_unknown=False)
        if parsed:
            valence = parsed[0].valence_name
            valence_counts[valence] += 1
            if valence != 'neutral':
                print(f" {word}: {valence}")
    except Exception:
        # Skip anything the lexicon cannot parse
        pass
print(f"\nValence Summary: {valence_counts}")
# Calculate sentiment score
total = sum(valence_counts.values())
score = (valence_counts['positive'] - valence_counts['negative']) / total
print(f"Sentiment Score: {score:.2f}")
 incompetent: negative
 layoffs: negative
 devastating: negative
Valence Summary: {'positive': 0, 'negative': 3, 'neutral': 5}
Sentiment Score: -0.38
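In practice you may want that loop wrapped in a reusable function. The helper below is a hypothetical wrapper (not part of Oyemi's API) around the same encode_parsed calls, returning a score in [-1, 1]:

```python
from oyemi import Encoder

enc = Encoder()

def sentence_valence(sentence: str) -> float:
    """Hypothetical helper: net valence of known words, in [-1, 1]."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for word in sentence.lower().split():
        parsed = enc.encode_parsed(word, raise_on_unknown=False)
        if parsed:  # skip words missing from the lexicon
            counts[parsed[0].valence_name] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no known words: treat as neutral
    return (counts["positive"] - counts["negative"]) / total

score = sentence_valence("The manager was incompetent and the layoffs were devastating")
print(f"{score:.2f}")  # -0.38, matching the summary above
```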
Find true synonyms using WordNet synset matching - words that share the same meaning:
from oyemi import find_synonyms
# Find synonyms for emotional words
for word in ["happy", "angry", "fired"]:
    syns = find_synonyms(word, limit=5)
    print(f"{word}: {syns}")
# Get weighted synonyms (higher weight = closer match)
weighted = find_synonyms("fear", return_weighted=True, limit=5)
print("\nWeighted synonyms for 'fear':")
for syn, weight in weighted:
    print(f" {syn}: {weight:.2f}")
happy: ['felicitous', 'glad', 'well-chosen']
angry: ['furious', 'raging', 'tempestuous', 'wild']
fired: ['discharged', 'dismissed', 'laid-off', 'pink-slipped']

Weighted synonyms for 'fear':
 dread: 1.00
 fearfulness: 1.00
 fright: 0.85
 reverence: 0.50
 awe: 0.50
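A natural use of the weights is query expansion: keep only synonyms above a confidence threshold. A short sketch (expand_query is a hypothetical helper), assuming the (synonym, weight) tuples shown above:

```python
from oyemi import find_synonyms

def expand_query(word: str, min_weight: float = 0.8) -> list[str]:
    """Hypothetical helper: expand a search term with close synonyms only."""
    weighted = find_synonyms(word, return_weighted=True, limit=10)
    return [word] + [syn for syn, weight in weighted if weight >= min_weight]

print(expand_query("fear"))
# ['fear', 'dread', 'fearfulness', 'fright'] given the weights above;
# 'reverence' and 'awe' (0.50) fall below the 0.8 threshold
```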
Calculate similarity between words based on their semantic codes - no embeddings required:
from oyemi import semantic_similarity
# Compare word pairs
pairs = [
    ("happy", "joyful"),    # Synonyms
    ("happy", "sad"),       # Antonyms
    ("dog", "cat"),         # Same category
    ("dog", "computer"),    # Different categories
    ("layoff", "fired"),    # Related workplace terms
]
print("Semantic Similarity Scores:")
for w1, w2 in pairs:
    sim = semantic_similarity(w1, w2)
    print(f" {w1:12} <-> {w2:12}: {sim:.2f}")
Semantic Similarity Scores:
 happy        <-> joyful      : 0.85
 happy        <-> sad         : 0.42
 dog          <-> cat         : 0.78
 dog          <-> computer    : 0.15
 layoff       <-> fired       : 0.72
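Because the scores are deterministic, they work well for simple nearest-word lookups against a fixed candidate list. A sketch (best_match is a hypothetical helper, not an Oyemi function):

```python
from oyemi import semantic_similarity

def best_match(word: str, candidates: list[str]) -> str:
    """Hypothetical helper: pick the semantically closest candidate."""
    return max(candidates, key=lambda c: semantic_similarity(word, c))

print(best_match("dog", ["cat", "computer"]))
# 'cat' (0.78 vs 0.15, per the scores above)
```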
Group words by their semantic category (superclass) for automatic topic clustering:
from oyemi import cluster_by_superclass
# Words from employee feedback
words = [
    "manager", "boss", "supervisor",    # Leadership
    "salary", "bonus", "compensation",  # Money
    "layoff", "fired", "terminated",    # Employment
    "stress", "anxiety", "fear",        # Emotions
]
# Cluster by semantic category
clusters = cluster_by_superclass(words)
print("Semantic Clusters:")
for superclass, cluster_words in clusters.items():
    print(f"\n [{superclass}]")
    for w in cluster_words:
        print(f" - {w}")
Semantic Clusters:
 [0214] Leadership
 - manager
 - boss
 - supervisor

 [0220] Compensation
 - salary
 - bonus
 - compensation

 [0233] Employment Actions
 - layoff
 - fired
 - terminated

 [0121] Emotions
 - stress
 - anxiety
 - fear
What you can do with deterministic semantic encoding:

- **Reproducible:** the same input always produces the same output. No model randomness, no training variance. Perfect for regulated industries (see the quick check below).
- **Lightweight:** no NLTK, no transformers, no GPU required at runtime. Just pure Python with a bundled SQLite lexicon.
- **Interpretable:** every code component has meaning. Superclass 0121 means "emotion.fear", not a black-box 768-dimensional vector.
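The reproducibility claim is easy to verify: two independent encoder instances should produce byte-identical codes, because the lexicon is a static lookup rather than a trained model. A minimal check using only the calls shown above:

```python
from oyemi import Encoder

# Two independent encoders, same word: same codes expected.
a = Encoder().encode("happy")
b = Encoder().encode("happy")
assert a == b, "codes should be identical across instances and runs"
print("deterministic:", a == b)  # deterministic: True
```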
Add deterministic semantic encoding to your NLP pipeline in minutes.
Oyemi v3.0.1 | Lexicon built from Princeton WordNet + SentiWordNet