Massive lexicon of word relationships

http://www.scubbly.com/item/43981/
Category: Databases & Networks

Product description

For sale: Over 4 MILLION words and phrases, in several languages, and the semantic relationships between them. Read on.

Several years ago, I was working on a project that required finding words "related to" one the user supplied. Merely importing a thesaurus wasn't enough. For instance, if someone typed "lawyer", I didn't want "barrister, attorney, counsel"; I wanted my query to return "divorce, subpoena, accident, lawsuit," etc.

Finding that no such data was readily available, even for sale, I had to create my own.

Finding words was easy: there are many public-domain dictionaries out there, and I merely imported them all and condensed them into one. To find the word relationships, I embarked on a 6-month project of constructing algorithms that find words frequently appearing together in small documents found on the internet and in large public-domain literature and periodical collections.
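To make the idea of "associated by frequency" concrete, here is a minimal sketch of document-level co-occurrence counting. It is an illustration only, not the algorithm used to build this data set (the listing does not describe it in detail); the function names, the tiny stopword list, and the tokenizing rule are all assumptions.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed, deliberately tiny stopword list; a real one would be far larger.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to"}

def cooccurrence_counts(documents):
    """Count how many documents each unordered pair of words shares."""
    pair_counts = Counter()
    for text in documents:
        # Lowercase, tokenize, and keep each distinct word once per document.
        words = set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS
        # Sorting makes every pair appear in one canonical order.
        pair_counts.update(combinations(sorted(words), 2))
    return pair_counts

def related_to(word, pair_counts, top_n=5):
    """Return the words that most often share a document with `word`."""
    scores = Counter()
    for (first, second), count in pair_counts.items():
        if first == word:
            scores[second] += count
        elif second == word:
            scores[first] += count
    return [other for other, _ in scores.most_common(top_n)]
```

Given a handful of short documents that mention "lawyer", `related_to("lawyer", cooccurrence_counts(docs))` ranks the words seen alongside it most often, which is the kind of result a thesaurus lookup would not produce.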

The end result is three large SQL tables. The first, "words", contains words in 6 European languages. The second, "related", contains associations between words produced by the first of my algorithms, and the third, "associated", contains somewhat looser associations produced by the second algorithm.
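As a rough picture of how three such tables might be queried, the sketch below builds a toy in-memory database and looks up the words linked to "lawyer". Only the table names "words", "related", and "associated" come from the listing; every column name (`id`, `word`, `lang`, `word_id`, `related_id`), the sample rows, and the use of SQLite instead of MySQL are assumptions made purely for illustration.

```python
import sqlite3

def build_demo_db():
    """Create a toy in-memory database with an assumed three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT, lang TEXT);
        CREATE TABLE related (word_id INTEGER, related_id INTEGER);
        CREATE TABLE associated (word_id INTEGER, related_id INTEGER);
        INSERT INTO words VALUES
            (1, 'lawyer', 'en'), (2, 'divorce', 'en'),
            (3, 'subpoena', 'en'), (4, 'accident', 'en');
        INSERT INTO related VALUES (1, 2), (1, 3);
        INSERT INTO associated VALUES (1, 4);
    """)
    return conn

def related_words(conn, word, table="related"):
    """Return words linked to `word` through the given association table."""
    # Table names cannot be bound as parameters, so whitelist them instead.
    if table not in ("related", "associated"):
        raise ValueError("table must be 'related' or 'associated'")
    query = f"""
        SELECT w2.word
        FROM words AS w1
        JOIN {table} AS link ON link.word_id = w1.id
        JOIN words AS w2 ON w2.id = link.related_id
        WHERE w1.word = ?
        ORDER BY w2.word
    """
    return [row[0] for row in conn.execute(query, (word,))]
```

The real dump may store the links differently (for example with a weight column or with words inline), so the join would need adjusting to the actual schema.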

The algorithms used are fairly simple, but sifting the data took several computers running nonstop, using custom-made scripts, for nearly all of the 6 months.

When all the work was done, the project for which I had originally started this adventure ran out of funding. So now I'm selling you 6 months' worth of intense work for less than $300.

What you get here is a dump of the MySQL database, compressed into a ZIP file. When you uncompress it, the resulting text file of SQL statements can be imported with MySQL, or with any database management tool that reads MySQL dumps, to reconstruct the three tables.
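The unpacking step can be scripted. The sketch below uses Python's standard `zipfile` module to pull any `.sql` file out of the archive; the function name and the assumption that the dump inside the ZIP ends in `.sql` are illustrative, since the listing only gives the archive's name.

```python
import zipfile
from pathlib import Path

def extract_dump(archive_path, dest_dir):
    """Extract every .sql file from the archive and return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            # Assumption: the dump inside the ZIP has a .sql extension.
            if name.lower().endswith(".sql"):
                extracted.append(Path(archive.extract(name, dest)))
    return extracted
```

The extracted file would then typically be loaded with the MySQL command-line client, along the lines of `mysql -u USER -p DATABASE < lexicon.sql`, where USER, DATABASE, and the inner file name are placeholders.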

I guess the thing I'm trying to express here is: this is not a silly little thesaurus. It's a serious pile of data that has had a lot of work put into it, containing word *relationships* that can't be gleaned from any other source.

$279.99
Approximately $279.99 USD

Product details

File name: lexicon.sql.zip
Size: 89MB
Added: October 15, 2010

Virus Free
Last scan: November 1, 2010

