11 Jul 2015
5 min read
956 words

Extracting Content from Academic Papers in PDF Format

We evaluated several toolkits for extracting text from PDF documents. This article shares our findings and explains why we ultimately chose CERMINE, the extraction tool developed by ADA Lab and CeON at ICM UW.

10 Apr 2014
2 min read
229 words

Exception handling basics

Exception handling is one of the most useful concepts introduced by modern languages. In a nutshell, it means catching errors as they occur instead of checking the inputs and return values of every function call.
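As a rough illustration of the idea, here is a minimal Python sketch. The function names `parse_port` and `load_port`, and the default of 8080, are hypothetical and chosen only for this example. The caller handles failure in one place rather than testing a return code after each step.

```python
def parse_port(text: str) -> int:
    """Convert text to a TCP port number, raising ValueError on bad input."""
    port = int(text)  # int() raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def load_port(text: str, default: int = 8080) -> int:
    """Return the parsed port, or the default if parsing fails."""
    try:
        return parse_port(text)
    except ValueError:
        # One handler covers both the non-numeric and the out-of-range case.
        return default
```

Calling `load_port("443")` returns the parsed value, while `load_port("abc")` and `load_port("70000")` both fall back to the default.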

22 Mar 2014
3 min read
568 words

Can you hear the sound? A brief introduction to Soundex.

A Soundex algorithm takes a word as input and produces a short string that represents how the word sounds.
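To make this concrete, here is a sketch of one common variant, the classic American Soundex, which maps a name to its first letter followed by three digits. It assumes the usual rule set (consonant groups 1 to 6, `h` and `w` ignored, vowels separating repeated codes) and is not necessarily the variant the article discusses.

```python
# Consonant groups of American Soundex; vowels, h, w and y carry no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the four-character American Soundex code of a word."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of identical codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (result + "000")[:4]  # pad with zeros, then truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Names that sound alike, such as "Robert" and "Rupert", receive the same code, which is what makes the scheme useful for fuzzy name matching.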

15 Mar 2014
3 min read
488 words

Floating point precision

How do you compare two numbers for equality?
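One common answer, sketched here in Python, is to compare within a tolerance instead of using `==`. The helper name `nearly_equal` and the tolerance values are assumptions for illustration. Suitable tolerances depend on the magnitude of the data.

```python
import math


def nearly_equal(a: float, b: float,
                 rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Compare two floats using a relative and an absolute tolerance."""
    # abs_tol matters near zero, where a purely relative tolerance shrinks to nothing.
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


total = 0.1 + 0.2
print(total == 0.3)              # False: the binary representations differ
print(nearly_equal(total, 0.3))  # True
```

The relative tolerance scales with the operands, so the same helper works for large and small values. The absolute tolerance covers comparisons against zero.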

14 Mar 2014
2 min read
256 words

String interning

String interning is a compiler optimization that detects when two string literals have the same content and automatically points them to the same object.
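The effect can be observed in Python, used here only as an illustration and not necessarily the language the article covers. Python exposes interning explicitly through `sys.intern`, so strings built at run time can be made to share one object. The helper `build` is hypothetical.

```python
import sys


def build(parts: list[str]) -> str:
    """Assemble a string at run time so it is not a compile-time literal."""
    return "".join(parts)


first = build(["inter", "ning"])
second = build(["inter", "ning"])

# Equal content does not imply a shared object for strings built at run time.
print(first == second)  # True

# sys.intern returns the canonical object for that content, so both calls
# yield the same object and identity comparison succeeds.
print(sys.intern(first) is sys.intern(second))  # True
```

Once strings are interned, an equality check can reduce to a pointer comparison, which is the main payoff of the technique.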