<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Spelling correction with Soundex</title>
	<atom:link href="http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/feed/" rel="self" type="application/rss+xml" />
	<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/</link>
	<description>Drupal, Web, Mobile, Data, Software Engineering, Development Process Management, Agile</description>
	<lastBuildDate>Thu, 26 Apr 2012 20:17:36 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<item>
		<title>By: Vincenzo</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-133685</link>
		<dc:creator>Vincenzo</dc:creator>
		<pubDate>Sat, 03 Mar 2012 14:17:41 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-133685</guid>
		<description>Then, Jaimie, you just need to deal with transpositions as well, not just misspellings. Two different problems, two different techniques to tackle them, and you implement both if you want the behaviour you are describing. That&#039;s it. Too much fuss over a simple matter.</description>
		<content:encoded><![CDATA[<p>Then, Jaimie, you just need to deal with transpositions as well, not just misspellings. Two different problems, two different techniques to tackle them, and you implement both if you want the behaviour you are describing. That&#8217;s it. Too much fuss over a simple matter.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: anonymous</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-130524</link>
		<dc:creator>anonymous</dc:creator>
		<pubDate>Mon, 08 Aug 2011 21:28:47 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-130524</guid>
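		<description>It should be:

// Return the word unchanged if it is already a key in the bucket for its Soundex code.
if (array_key_exists($word, $dic[soundex($word)])) {
    return $word;
}</description>
		<content:encoded><![CDATA[<p>It should be:</p>
<p>// Return the word unchanged if it is already a key in the bucket for its Soundex code.<br />
if (array_key_exists($word, $dic[soundex($word)])) {<br />
    return $word;<br />
}</p>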
]]></content:encoded>
	</item>
	<item>
		<title>By: scragar</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-130430</link>
		<dc:creator>scragar</dc:creator>
		<pubDate>Wed, 03 Aug 2011 18:55:20 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-130430</guid>
		<description>I&#039;ve just found this and love the idea. I have just written a system (entering testing tomorrow) that stores client details, and I think it would be a great idea to store a Soundex code of the first and last names whenever a client name is updated.
When entering a new name, I can use these codes to find possible name matches as efficiently as possible. I can also combine this with the transposition comments and match the Soundex codes for different first characters (the most likely source of problems).</description>
		<content:encoded><![CDATA[<p>I&#8217;ve just found this and love the idea. I have just written a system (entering testing tomorrow) that stores client details, and I think it would be a great idea to store a Soundex code of the first and last names whenever a client name is updated.<br />
When entering a new name, I can use these codes to find possible name matches as efficiently as possible. I can also combine this with the transposition comments and match the Soundex codes for different first characters (the most likely source of problems).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jaimie Sirovich</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-129811</link>
		<dc:creator>Jaimie Sirovich</dc:creator>
		<pubDate>Fri, 17 Jun 2011 16:13:49 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-129811</guid>
		<description>All I&#039;m saying is, if you have an eCommerce site:

A user types &quot;cameras,&quot; &quot;camras,&quot; &quot;camrea,&quot; &quot;kamera,&quot; etc. He doesn&#039;t care whether he made a typo or spelt it wrong. He wants cameras.</description>
		<content:encoded><![CDATA[<p>All I&#8217;m saying is, if you have an eCommerce site:</p>
<p>A user types &#8220;cameras,&#8221; &#8220;camras,&#8221; &#8220;camrea,&#8221; &#8220;kamera,&#8221; etc. He doesn&#8217;t care whether he made a typo or spelt it wrong. He wants cameras.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vincenzo Russo</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-129803</link>
		<dc:creator>Vincenzo Russo</dc:creator>
		<pubDate>Fri, 17 Jun 2011 05:45:25 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-129803</guid>
		<description>I don&#039;t think we are understanding each other. I might be wrong in saying that no one cares about transpositions (personally, I don&#039;t, and that has always panned out). But the point stands: transposition *is not* a spelling mistake (I am not talking from an end-user point of view here). It needs to be handled specifically, which is what Editex does.</description>
		<content:encoded><![CDATA[<p>I don&#8217;t think we are understanding each other. I might be wrong in saying that no one cares about transpositions (personally, I don&#8217;t, and that has always panned out). But the point stands: transposition *is not* a spelling mistake (I am not talking from an end-user point of view here). It needs to be handled specifically, which is what Editex does.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jaimie Sirovich</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-129800</link>
		<dc:creator>Jaimie Sirovich</dc:creator>
		<pubDate>Fri, 17 Jun 2011 02:09:23 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-129800</guid>
		<description>I won&#039;t argue with you, but you&#039;re just plain wrong IMO. I&#039;m sure anyone who wants to write a credible spell checker would have to consider transpositions, other typos, and &quot;sounds-like&quot; errors. Worse, you can&#039;t consider each one in a vacuum. That&#039;s silly. See the &quot;Editex&quot; algorithm.

They&#039;re all spelling mistakes. Semantics aside, nobody cares how it happened. There are numerous gray areas, especially when you consider doubled letters, etc. Edit distance and typos are the same thing to me.</description>
		<content:encoded><![CDATA[<p>I won&#8217;t argue with you, but you&#8217;re just plain wrong IMO. I&#8217;m sure anyone who wants to write a credible spell checker would have to consider transpositions, other typos, and &#8220;sounds-like&#8221; errors. Worse, you can&#8217;t consider each one in a vacuum. That&#8217;s silly. See the &#8220;Editex&#8221; algorithm.</p>
<p>They&#8217;re all spelling mistakes. Semantics aside, nobody cares how it happened. There are numerous gray areas, especially when you consider doubled letters, etc. Edit distance and typos are the same thing to me.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vincenzo</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-129451</link>
		<dc:creator>Vincenzo</dc:creator>
		<pubDate>Tue, 10 May 2011 06:31:20 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-129451</guid>
		<description>No big assumption. Letter transposition is not a spelling mistake. Different problem, different algorithm. As a matter of fact, even Google suggests «actress» as the correction for acress. Quite obviously, I&#039;d say.

So, like I said, you could tackle the transposition problem with an additional algorithm. However, this leads to another problem: which of the two algorithms should be applied first? How do I know it was a typo and not a genuine misspelling? You might find some tradeoff, but there isn&#039;t going to be any exact approach.

Generally, no one cares about transposition.</description>
		<content:encoded><![CDATA[<p>No big assumption. Letter transposition is not a spelling mistake. Different problem, different algorithm. As a matter of fact, even Google suggests «actress» as the correction for acress. Quite obviously, I&#8217;d say.</p>
<p>So, like I said, you could tackle the transposition problem with an additional algorithm. However, this leads to another problem: which of the two algorithms should be applied first? How do I know it was a typo and not a genuine misspelling? You might find some tradeoff, but there isn&#8217;t going to be any exact approach.</p>
<p>Generally, no one cares about transposition.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jaimie Sirovich</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-129269</link>
		<dc:creator>Jaimie Sirovich</dc:creator>
		<pubDate>Sat, 16 Apr 2011 16:46:09 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-129269</guid>
		<description>That&#039;s a pretty big assumption right there. In general, though, your approach will break on many letter transpositions, and on all words where the first letter changes.

Acress is a letter transposition: swap the a and the c. It *could* be actress, but it&#039;s just as likely to be caress. In fact, you&#039;d probably have to use word bigrams to solve that one.</description>
		<content:encoded><![CDATA[<p>That&#8217;s a pretty big assumption right there. In general, though, your approach will break on many letter transpositions, and on all words where the first letter changes.</p>
<p>Acress is a letter transposition: swap the a and the c. It *could* be actress, but it&#8217;s just as likely to be caress. In fact, you&#8217;d probably have to use word bigrams to solve that one.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vincenzo Russo</title>
		<link>http://neminis.org/blog/research/text-mining/spelling-correction-with-soundex/#comment-129266</link>
		<dc:creator>Vincenzo Russo</dc:creator>
		<pubDate>Sat, 16 Apr 2011 09:35:52 +0000</pubDate>
		<guid isPermaLink="false">http://neminis.org/?p=492#comment-129266</guid>
		<description>We are talking about an algorithm for automatic spelling correction. Acress IS NOT a misspelling of caress. It is a typo, if you meant to write caress. In fact, if you type acress, it is very likely intended as a misspelling of actress.</description>
		<content:encoded><![CDATA[<p>We are talking about an algorithm for automatic spelling correction. Acress IS NOT a misspelling of caress. It is a typo, if you meant to write caress. In fact, if you type acress, it is very likely intended as a misspelling of actress.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

