Thursday, August 21, 2008

Russian OCR Magic

For the past few months, give or take a year, I've been looking for a good OCR application to pull Russian text out of images. Finally, I can say that I've found one that can help me complete my translation project for this year. I have no doubt that the Tesseract project will be able to do this eventually, but at the moment it has bugs and glitches I can't deal with. I was not happy with Tesseract's output, so I went searching for bigger and better things and, voilà, I found an application written by Russian scientists. It gives great results compared to what I'd found previously. It's called CuneiForm and it's open source. You can download the zip and run it on Windows or under Wine. If you really want to get down and dirty, the source is available too. As of this writing, CuneiForm is at version 12. It supports 20 languages: English, German, French, Spanish, Italian, Portuguese, Dutch, Russian, Mixed Russian-English, Ukrainian, Danish, Swedish, Finnish, Serbian, Croatian, Polish and others. Enjoy.


Watch these videos to learn how to translate text in images:

How to install CuneiForm on Windows. If you see question marks instead of Russian text in CuneiForm, you will have to change your locale or just wing it:
VIDEO screencast.com/t/wrST2VB3
How to extract Russian text from images into text files:
VIDEO screencast.com/t/WFJVpHvS
How to translate your newly extracted Russian text with free online tools (e.g. translation2.paralink.com, google.com/translate):
VIDEO screencast.com/t/bDIhI6XPq

I hope this helps someone, somehow. I'm thinking about using Python with Pyrex, or something else more feasible, to automate this task and future ones. Thanks. -A
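For the automation idea, here is a rough sketch in plain Python rather than Pyrex. It assumes the command-line Linux port of CuneiForm is installed as `cuneiform` and accepts `-l` (language) and `-o` (output file); the Windows GUI build shown in the videos does not work this way. The function names and the `.bmp` filter are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_ocr_command(image_path, output_path, language="rus"):
    """Build the argument list for one CuneiForm command-line run.

    Assumption: the Linux port's CLI, with -l for the language code
    and -o for the output text file.
    """
    return ["cuneiform", "-l", language, "-o", str(output_path), str(image_path)]


def ocr_folder(image_dir, text_dir, language="rus"):
    """Run OCR on every .bmp image in image_dir, one text file per image."""
    text_dir = Path(text_dir)
    text_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for image in sorted(Path(image_dir).glob("*.bmp")):
        output = text_dir / (image.stem + ".txt")
        # check=True raises CalledProcessError if cuneiform exits non-zero.
        subprocess.run(build_ocr_command(image, output, language), check=True)
        results.append(output)
    return results
```

Calling `ocr_folder("scans", "text")` would then leave one `.txt` file per scanned page, ready to paste into a translation tool.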

Thursday, August 14, 2008

Python STDOUT Colors Script

This is a great script I use for debugging and general stdout colorization when working with Python. If you run it from the console with no parameters, it loops through the stdout colors, displaying each one alongside the string that names it. Note: some colors may look different in your terminal than their names suggest. It's very useful to import into your other scripts so you can print your output in color. I thought it was great. It makes debugging large amounts of data a snap. Maybe someone else will find it useful as well. Enjoy!

This script is also located at:
code.google.com/p/python-stdout-colors/

Download PY!
Tell me what you think, leave a comment.

Runnable from terminal:
chmod +x stdout_colours.py
python stdout_colours.py

Use it in your code like this:

self.soc.write(['printing','a','list'],"red")
self.soc.write("printing a string","green")
self.soc.write({"printing":"dictionary","testing":"fun"},'blue')
self.soc.write(("printing","a","tuple"),'yellow')

Add it to your functions like this:

import stdout_colours

class some_class(object):
    def __init__(self):
        self.testing="fun"
        self.func_me_color="white_on_blue"
        # one colorizer instance shared by every method below
        self.soc=stdout_colours.stdout_colors()
        self.soc.me_him(['ENTER:',__name__],self.func_me_color)
        self.soc.write("doing something:","red")
        self.do_something()
        self.soc.me_him(['EXIT:',__name__],self.func_me_color)

    def do_something(self):
        self.soc.me(['ENTER:',__name__],self.func_me_color)
        self.soc.write("doing something else:","green")
        self.do_something_else()
        self.soc.me(['EXIT:',__name__],self.func_me_color)

    def do_something_else(self):
        self.soc.me_him(['ENTER:',__name__],self.func_me_color)
        # testing is an instance attribute, so it needs the self. prefix
        self.soc.write(['testing','is',self.testing],"yellow")
        self.soc.me_him(['EXIT:',__name__],self.func_me_color)
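
If you're curious how terminal colors work under the hood, here is a minimal sketch of the general technique: wrapping text in ANSI escape codes. This is not the code from stdout_colours.py; the `colorize` and `write` helpers and the color table are made up for illustration, and only the escape codes themselves are standard.

```python
import sys

# Standard ANSI foreground color codes. The names are illustrative and
# are not taken from stdout_colours.py.
ANSI_CODES = {
    "red": 31,
    "green": 32,
    "yellow": 33,
    "blue": 34,
    "white": 37,
}
RESET = "\033[0m"  # switches the terminal back to its default colors


def colorize(value, color):
    """Return str(value) wrapped in the escape codes for the given color."""
    return "\033[%dm%s%s" % (ANSI_CODES[color], value, RESET)


def write(value, color, stream=sys.stdout):
    """Print any object (string, list, dict, tuple) in the given color."""
    stream.write(colorize(value, color) + "\n")
```

With this, `write(["printing", "a", "list"], "red")` prints the list in red on any terminal that understands ANSI escapes.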


EDIT: I actually like to use this over print when I deal with terminal/console apps. Much easier to tell what is going on when text is scrolling by so fast.
I hope this helps someone. Leave a comment. Enjoy.

Sunday, August 10, 2008

PHP MVC Framework

Are you looking to make web applications in PHP? Check out CodeIgniter. It will save you some time, just as it has for me over the last few years. I have to say this is the best MVC framework I've found for PHP. Visit the forums and see for yourself how powerful it is. You can do in a few lines of code what would take 50 in another framework. All your logic goes in the controllers and the db stuff goes in models, while all the presentation layer stuff lives in views. On top of all this, you can break up your views into sections and call views from views and controllers from controllers (with add-on libs, of course). If you're looking for good libs to add to CI, check out Modular Extensions. It gives you the power to call controllers from controllers and to separate code into callable modules. There are plenty of login libs in the wild as well. I say to each his own. simplelogin is perfect for a simple web app, with minor modifications. Khaos is another good one. If you want to use Smarty in your web apps, you can add a lib wrapper for that as well. This way you can use CI views with Smarty-style syntax and get the best of both worlds. I hope this helps you.

Friday, August 8, 2008

My .screenrc

Hello and welcome to my blog. This is my first post and I would like to share my .screenrc file with you.
This .screenrc file is located in my home directory. With it you can navigate between tabs by pressing ` (backtick, located to the left of 1 on most keyboards) and then the number of the tab you want to go to. By default you would press Ctrl+a followed by the tab number. Use ` (backtick) then Shift+A to rename your current tab. You can also google for more screen goodies, since I don't cover all of them here.

EDIT: If you copy and paste the below text into a file and call it .screenrc you will get errors. You can get the file on pastebin here and paste it into a text editor (nano/pico/vi/vim/gedit/kwrite/kate/etc...)


.screenrc




vbell off
startup_message off
# create a status line at the bottom of the screen. this will show the titles and locations of
# all screen windows you have open at any given time
hardstatus alwayslastline
hardstatus string '%{gk}[ %{G}%H %{g}][%= %{wk}%?%-Lw%?%{=b kR}(%{W}%n*%f %t%?(%u)%?%{=b kR})%{= kw}%?%+Lw%?%?%= %{g}][%{Y}%l%{g}]%{=b C}[ %m/%d %c ]%{W}'
# bind some function keys (k1 == F1, etc) for fast navigation through screen windows
#
#
bindkey -k k2 prev
bindkey -k k3 next
# This changes the default control character (normally ^a) to something else
# (i do this to ease the use of nested screens so command characters dont conflict with each other)
escape `` #"^Ff"
# set the ssh-agent on my workstation to forward my ssh key through my screen windows
#. .keychain/$HOSTNAME-sh
# this will log screen errors to a daily log under the specified directory
logfile /home/$USER/logs/screen_%y-%m-%d_%0c

screen -t irc /bin/sh -c "if [ $USER != 'root' ]; then irssi -c niven.freenode.net; fi;bash"
screen -t sudoscrn /bin/sh -c "sudo screen -c '/etc/screenrc';bash"
screen -t luser@box /bin/sh -c "ssh luser@box;bash"
screen -t mysql /bin/sh -c "if [ $USER != 'root' ]; then mysql; else mysql -u root -p; fi;bash"
screen -t python /bin/sh -c "python"
screen -t bash
screen -t bash
screen -t bash
screen -t arpwatch /bin/sh -c "arpwatch;bash"
screen -t top /bin/sh -c "top;bash"
#shelltitle "$ |bash"

# these last 2 lines are to set the focus on startup (which screen window we look at when screen finishes starting)
focus
select 1



I like to run screen within screen, so I have a separate screenrc loaded from /etc/screenrc via the -c flag.

Screen is definitely a life saver when you are working in terminal and need to suddenly close it for some reason.
To list all active screen sessions type: screen -ls
To reattach to a detached session, find the session first with the command above and then type: screen -r ####.pts-#
Note: replace the #'s with your session number
To reattach to a screen session that is still attached somewhere else, detaching it wherever it's open and reattaching it in the current terminal, type: screen -raAD
I use this a lot. More than screen -r
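
If you ever want to script this, here is a small Python sketch that parses the output of screen -ls into a list of sessions. The function names are made up, and the regular expression assumes the usual "PID.name (Attached|Detached)" line layout; other states some screen versions print are simply skipped.

```python
import re
import subprocess

# Matches lines such as "\t12345.pts-0.myhost\t(Detached)". Newer screen
# versions put a date between the name and the state, which ".*" skips.
SESSION_RE = re.compile(
    r"^[ \t]+(\d+)\.(\S+)\s.*\((Attached|Detached)\)", re.MULTILINE
)


def parse_sessions(listing):
    """Return (pid, name, state) tuples from `screen -ls` output."""
    return [
        (int(pid), name, state)
        for pid, name, state in SESSION_RE.findall(listing)
    ]


def list_sessions():
    """Run `screen -ls` and parse what it prints."""
    # Some screen versions exit non-zero from -ls, so check=True is not used.
    result = subprocess.run(["screen", "-ls"], capture_output=True, text=True)
    return parse_sessions(result.stdout)
```

The first two fields of each tuple, joined with a dot, give the name you would pass to screen -r.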

If you have a KeySpan you can also use screen to connect to serial devices.
If you're connecting to a headless machine with a KeySpan you can just type: screen /dev/ttyUSB0 115200

Note: You may need to change your baud rate and/or parity bit based on the device you connect to.

Or you can just use minicom for this, but personally I prefer screen.

Enjoy, I hope this helps someone.