<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Geek on the Loose &#187; Computer Science</title>
	<atom:link href="http://www.geekontheloose.com/computer-science/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.geekontheloose.com</link>
	<description>Just another girl-geek weblog</description>
	<lastBuildDate>Thu, 25 Mar 2010 04:17:34 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Another Ada Lovelace Day Post &#8211; CS Role Model</title>
		<link>http://www.geekontheloose.com/computer-science/another-ada-lovelace-day-post-cs-role-model/</link>
		<comments>http://www.geekontheloose.com/computer-science/another-ada-lovelace-day-post-cs-role-model/#comments</comments>
		<pubDate>Thu, 25 Mar 2010 04:17:34 +0000</pubDate>
		<dc:creator>joulie</dc:creator>
				<category><![CDATA[Computer Science]]></category>

		<guid isPermaLink="false">http://www.geekontheloose.com/?p=239</guid>
		<description><![CDATA[I posted earlier for Ada Lovelace Day about LinuxChix.org as a great resource for women in technology, and now I'm getting into the groove and want to add another post, this time about my first female role model in computer science.
Dr. Neelima Shrikhande is a professor of computer science at Central Michigan University. At the [...]]]></description>
			<content:encoded><![CDATA[<p>I posted earlier for <a title="Ada Lovelace Day" href="http://findingada.com/">Ada Lovelace Day</a> about <a title="LinuxChix - a community for women who like Linux and Free Software" href="http://www.linuxchix.org/">LinuxChix.org</a> as a great resource for women in technology, and now I'm getting into the groove and want to add another post, this time about my first female role model in computer science.</p>
<p><a title="Dr. Neelima Shrikhande - Computer Science Professor" rel="nofollow" href="http://www.cps.cmich.edu/faculty/shrikhande.shtml">Dr. Neelima Shrikhande</a> is a professor of <a title="Computer Science Department at Central Michigan University" rel="nofollow" href="http://www.cps.cmich.edu/">computer science at Central Michigan University</a>. At the time I was working on my MS, she was the only female professor in the department. I never had an indication that she views herself as a role model for the few women studying computer science there, but she is definitely a role model.</p>
<p>She's a super intelligent and focused woman for whom I have a lot of respect. According to the <a title="Dr. Neelima Shrikhande - Computer Science Professor" href="http://www.news.cmich.edu/experts/2007/09/neelima-shrikhande/">cmich.edu website</a>, she "is an authority on computer vision and artificial intelligence. She studies how to make computers capable of seeing things and understanding pictures."</p>
<p>I had her for only one class, my compiler class, but she really opened up the world of computer science for me with that class. It was a hard and life-consuming class, but I loved it more than any other class and even used what I learned for my thesis. I now have a life-long fascination with compilers and virtual machines because of that class and I still have my <a title="Dragon Book computer science compiler textbook" rel="nofollow" href="http://en.wikipedia.org/wiki/Dragon_Book_%28computer_science%29">dragon book</a>. I never thought about it at the time, but I imagine that class was at least as hard to teach as it was to take, and she rose to the challenge seamlessly.</p>
<p><a rel="nofollow" href="http://www.cps.cmich.edu/"><img class="alignleft size-full wp-image-242" style="margin-left: 10px; margin-right: 10px;" title="Central Michigan University" src="http://www.geekontheloose.com/wp-content/uploads/2010/03/cmich-edu.gif" alt="Central Michigan University" width="120" height="75" /></a>Thanks, Dr. Shrikhande, for being such a sharp, successful role model in computer science.</p>
<p>Shameless plug: the <a title="Computer Science Department at Central Michigan University" rel="nofollow" href="http://www.cps.cmich.edu/">CMU CS department</a> is a great place for an education!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.geekontheloose.com/computer-science/another-ada-lovelace-day-post-cs-role-model/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Tale About Problems of Scale</title>
		<link>http://www.geekontheloose.com/computer-science/a-tale-about-problems-of-scale/</link>
		<comments>http://www.geekontheloose.com/computer-science/a-tale-about-problems-of-scale/#comments</comments>
		<pubDate>Sun, 31 Jan 2010 01:29:41 +0000</pubDate>
		<dc:creator>joulie</dc:creator>
				<category><![CDATA[Computer Science]]></category>

		<guid isPermaLink="false">http://www.geekontheloose.com/?p=117</guid>
		<description><![CDATA[
I earned some new bragging rights this week and had fun doing it. I love computer science!


A tale about iteration, unexpectedly large data sets, and time... 


About a month ago, a large file of data unexpectedly needed to be processed, using some pre-existing code written several years ago. This code happened to be written in [...]]]></description>
			<content:encoded><![CDATA[<p>
I earned some new bragging rights this week and had fun doing it. I love computer science!
</p>
<p>
<em><strong>A tale about iteration, unexpectedly large data sets, and time...</strong> </em>
</p>
<p>
About a month ago, a large file of data unexpectedly needed to be processed, using some pre-existing code written several years ago. This code happened to be written in Java, but this story could apply to almost any commonly used programming language. It was a batch process, so speed wasn't critical, but at the same time it shouldn't run on and on and on because other files also needed to be processed. The data set was 500K records, whereas the larger data sets are normally in the 20-30K range, so this was more than 10x the norm.
</p>
<p>
Everything seemed to be going fine. The file had been read into a database, some processing had been done on it, and the time had come to write out a result file. Along with results, the original programmer had wanted some statistics, so there was a quick iteration through all of the records to gather the statistics and then the results would be written into the file. It seemed straightforward and I didn't expect anything to go wrong. When things go wrong with files, my experience has been that they usually go wrong earlier on.</p>
<p><span id="more-117"></span></p>
<p><a href="http://www.geekontheloose.com/wp-content/uploads/2010/02/nicubunu_Hourglass.png" title="hourglass"><img class="alignleft size-full wp-image-119" title="nicubunu_Hourglass" src="http://www.geekontheloose.com/wp-content/uploads/2010/02/nicubunu_Hourglass.png" alt="" width="128" height="128" /></a><em><strong>Time... </strong></em></p>
<p>
I waited for the file, and then waited some more. One hour passed, and then another. Where was the file?
</p>
<p>
Normally, this process will update a status table in the database to indicate what it is currently doing, but it hadn't updated that record since the file writing phase had supposedly begun. Starting to worry that it was hung or deadlocked, I set off to investigate. The CPUs on the server showed that something was going on and it certainly wasn't idle, but they were only 35-45% busy. The memory was not paging. I checked the database and couldn't find any sign of deadlock.
</p>
<p>
Being a true lover of code, I did the next obvious step from my point of view – I read the code. I found the relevant bit of code and started tracing through that file writer step-by-step. What I discovered was that the statistics gathering phase wasn't written in a particularly optimal way. The developer had probably never imagined that such a large file would ever be processed.
</p>
<p>
<em><strong>Loop de doop... </strong></em>
</p>
<p>
The code was a single loop over the records that were stored for the original file. For each record, it ran one query to read the record, a second to look up related data, and a third to map a simple translation. All three queries were very simple and used primary keys or indexes, so individually, they all ran very fast. Each lookup resulted in some statistics being updated, and the mapping applied nice text labels to the numbers so that humans could make sense of them. The code was O(n) in database round trips (three per record), so not O(1), but not so bad for the usual case.
</p>
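<p>
A minimal sketch of the shape of that loop, not the original code. The three maps are hypothetical in-memory stand-ins for the database tables, and every name here (PerRecordStats, lookup, run) is invented for illustration. Each call to lookup represents one round trip, so n records cost 3n lookups.
</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-record statistics loop.
class PerRecordStats {
    // Counts simulated database round trips.
    static int queryCount = 0;

    // One call represents one primary-key query against a table.
    static <K, V> V lookup(Map<K, V> table, K key) {
        queryCount++;
        return table.get(key);
    }

    // Three lookups per record: the record, its related data, its text label.
    static Map<String, Integer> gather(Map<Integer, Integer> records,
                                       Map<Integer, Integer> related,
                                       Map<Integer, String> labels,
                                       int recordCount) {
        Map<String, Integer> stats = new HashMap<>();
        for (int id = 0; id < recordCount; id++) {
            Integer relatedId = lookup(records, id);   // query 1: read the record
            Integer code = lookup(related, relatedId); // query 2: related data
            String label = lookup(labels, code);       // query 3: map code to label
            stats.merge(label, 1, Integer::sum);       // update the statistics
        }
        return stats;
    }

    // Builds made-up tables for n records and gathers statistics over them.
    static Map<String, Integer> run(int n) {
        queryCount = 0;
        Map<Integer, Integer> records = new HashMap<>();
        Map<Integer, Integer> related = new HashMap<>();
        Map<Integer, String> labels = new HashMap<>();
        labels.put(0, "even");
        labels.put(1, "odd");
        for (int id = 0; id < n; id++) {
            records.put(id, id);
            related.put(id, id % 2);
        }
        return gather(records, related, labels, n);
    }

    public static void main(String[] args) {
        Map<String, Integer> stats = run(1000);
        System.out.println(stats + " gathered with " + queryCount + " lookups");
    }
}
```

<p>
With in-memory maps the lookups are nearly free. Against a real database each one pays for a network round trip and query parsing, which is where the time goes as n grows.
</p>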
<p>
<a href="http://www.geekontheloose.com/wp-content/uploads/2010/02/Anonymous_Praying_Mantis.png" title="praying mantis"><img class="size-full wp-image-120 alignright" style="border: 0pt none; margin: 10px;" title="Anonymous_Praying_Mantis" src="http://www.geekontheloose.com/wp-content/uploads/2010/02/Anonymous_Praying_Mantis.png" alt="" width="104" height="129" /></a>It took almost 4 hours for the statistics to be gathered for all 500K records, for a total of 1.5 million queries, which worked out to about 29 ms for each set of three queries (4 hours is 14,400 seconds, divided by 500,000 records). For 25K, that would be about 12 minutes. Unless someone was watching carefully to see the progress, in normal batch processing, 12 minutes would not usually be enough to send up red flags and catch anyone's attention.
</p>
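<p>
The arithmetic can be checked with a few lines of Java. This is only a sketch: it treats "almost 4 hours" as exactly 4 hours, and the class and method names are made up for illustration. Under that assumption the cost comes to 28.8 ms per record, and a 25K-record file would take 12 minutes.
</p>

```java
// Back-of-the-envelope timing check, assuming exactly 4 hours for 500K records.
class BatchTiming {
    // Milliseconds spent per record for a run of the given length.
    static double millisPerRecord(double hours, long records) {
        return hours * 3600.0 * 1000.0 / records;
    }

    // Projected run time in minutes at a fixed per-record cost.
    static double projectedMinutes(double millisPerRecord, long records) {
        return millisPerRecord * records / 60000.0;
    }

    public static void main(String[] args) {
        double perRecord = millisPerRecord(4.0, 500_000);
        System.out.printf("%.1f ms per record%n", perRecord);
        System.out.printf("%.1f minutes for 25K records%n",
                projectedMinutes(perRecord, 25_000));
    }
}
```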
<p>
Of course, this discovery was just the beginning. Naturally it had to be fixed. That's where the events of this week come into play. I had some time set aside to look at improving the code this week, so I immediately set about finding a way to let the database do the work. After a few hours of composing and testing queries, I had it! A single query that could gather all of the statistics on the 500K records including the text labels, and do it in 1 minute 41 seconds. That felt good, really good!</p>
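<p>
The actual query isn't shown here, so the following is only a sketch of what "letting the database do the work" can look like through JDBC. The schema is entirely hypothetical (tables records, related and labels, and all of their columns), as are the class and method names. The idea is that the two joins and the GROUP BY replace the three per-record lookups with one round trip.
</p>

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical set-based replacement for the per-record loop.
class SingleQueryStats {
    // Assumed schema: records(file_id, related_id), related(id, code),
    // labels(code, label). None of these names come from the real system.
    static final String STATS_SQL =
          "SELECT l.label, COUNT(*) AS total "
        + "FROM records r "
        + "JOIN related x ON x.id = r.related_id "
        + "JOIN labels l ON l.code = x.code "
        + "WHERE r.file_id = ? "
        + "GROUP BY l.label";

    // Runs one query and returns a count per text label.
    static Map<String, Long> gather(Connection conn, long fileId) throws SQLException {
        Map<String, Long> stats = new LinkedHashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(STATS_SQL)) {
            ps.setLong(1, fileId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    stats.put(rs.getString("label"), rs.getLong("total"));
                }
            }
        }
        return stats;
    }
}
```

<p>
Calling gather needs an open java.sql.Connection to a database with matching tables. The database can then choose its own join strategy instead of answering 1.5 million separate requests.
</p>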
]]></content:encoded>
			<wfw:commentRss>http://www.geekontheloose.com/computer-science/a-tale-about-problems-of-scale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
