Goslate: Free Google Translate API

goslate provides a free Python API to the Google Translate service by querying the Google Translate website.

It is:

  • Free: get translations through the public Google website at no cost
  • Fast: batches, caches, and fetches concurrently
  • Simple: single file module, just Goslate().translate('Hi!', 'zh')

Usage

>>> import goslate
>>> gs = goslate.Goslate()
>>> print(gs.translate('hello world', 'de'))
hallo welt

For romanized writing (romanization), batch translation, language detection, proxy support, etc., please check the API reference.

Install

goslate supports both Python 2 and Python 3. You can install it via:

$ pip install goslate

or just download the latest goslate.py and use it directly.

The futures package is optional, but installing it is recommended for best performance on large translation tasks.
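
Whether a thread pool is available can be checked up front. The following is a minimal sketch, assuming the futures backport on Python 2 is importable as concurrent.futures (on Python 3 that module is part of the standard library). HAS_FUTURES is a name used only for this illustration; the fallback message restates the single-thread behaviour described in the reference.

```python
# Check whether a thread-pool implementation can be imported.
try:
    # Standard library on Python 3; provided by the "futures" backport on Python 2.
    from concurrent.futures import ThreadPoolExecutor
    HAS_FUTURES = True
except ImportError:
    ThreadPoolExecutor = None
    HAS_FUTURES = False

if HAS_FUTURES:
    print("futures available: batch input can be handled concurrently")
else:
    print("futures missing: goslate will work in single-thread mode")
```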

CLI

goslate.py is also a command-line tool:

  • Translate stdin input into Chinese in GBK encoding

    $ echo "hello world" | goslate.py -t zh-CN -o gbk
    
  • Translate 2 text files into Chinese, output to UTF-8 file

    $ goslate.py -t zh-CN -o utf-8 source/1.txt "source 2.txt" > output.txt
    

Use --help for detailed usage:

$ goslate.py -h

What’s New

1.3.0

  • [new feature] Translation in roman writing system (romanization), thanks to Javier del Alamo for the contribution.
  • [new feature] Customizable service URL. You can provide multiple Google Translate service URLs for better concurrency performance
  • [new option] Roman writing translation option for the CLI
  • [fix bug] Google translation may change normal space to no-break space
  • [fix bug] Google web API changed for getting supported language list

Reference

exception goslate.Error

Error type

class goslate.Goslate(writing=(u'trans', ), opener=None, retry_times=4, executor=None, timeout=4, service_urls=(u'http://translate.google.com', ), debug=False)

The entire goslate API lives in this class.

You must first create an instance of Goslate to use this API.

Parameters:
  • writing – The translation writing system. Three values are currently valid: WRITING_NATIVE, WRITING_ROMAN and WRITING_NATIVE_AND_ROMAN

  • opener (urllib2.OpenerDirector) – The URL opener to be used for HTTP/HTTPS queries. If not provided, a default opener will be used. For proxy support you should provide an opener with ProxyHandler
  • retry_times (int) – how many times to retry when a connection reset error occurs. Defaults to 4
  • timeout (int/float) – HTTP request timeout in seconds
  • debug (bool) – Turn on/off the debug output
  • service_urls (single string or a sequence of strings) – Google Translate URL list. URLs will be used randomly for better concurrent performance. For example ['http://translate.google.com', 'http://translate.google.de']
  • executor (futures.ThreadPoolExecutor) – the multi-thread executor for handling batch input. Defaults to a global futures.ThreadPoolExecutor instance with 120 max thread workers if futures is available. Set to None to disable multi-thread support

Note

The multi-thread worker relies on futures. If it is not available, goslate will work in single-thread mode.
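
The opener, executor, timeout and service_urls parameters can be combined. The following is a minimal sketch, assuming Python 3 (where urllib2 has become urllib.request). The proxy address proxy.example.com:8080, the pool size of 20 and the timeout of 10 seconds are illustration values only, and make_translator is a hypothetical helper that is defined but not called here, so the snippet runs without goslate installed or network access.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical proxy address, used only for illustration.
PROXY = {"http": "http://proxy.example.com:8080"}

# An opener that routes HTTP requests through the proxy.
proxy_handler = urllib.request.ProxyHandler(PROXY)
opener = urllib.request.build_opener(proxy_handler)

# A dedicated thread pool instead of the global default executor.
executor = ThreadPoolExecutor(max_workers=20)

# Keyword arguments matching the Goslate constructor signature above.
options = {
    "opener": opener,
    "executor": executor,
    "timeout": 10,
    "service_urls": ["http://translate.google.com", "http://translate.google.de"],
}


def make_translator():
    """Create a Goslate instance from the options above (requires goslate)."""
    import goslate
    return goslate.Goslate(**options)
```

With goslate installed, gs = make_translator() would then be used like any other Goslate instance.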

Example:
>>> import goslate
>>> 
>>> # Create a Goslate instance first
>>> gs = goslate.Goslate()
>>> 
>>> # You could get all supported language list through get_languages
>>> languages = gs.get_languages()
>>> print(languages['en'])
English
>>> 
>>> # Translate English into German
>>> print(gs.translate('hello', 'de'))
Hallo
>>> # Detect the language of the text
>>> print(gs.detect('some English words'))
en
>>> # Get a goslate object dedicated to romanized translation (romanization)
>>> gs_roman = goslate.Goslate(goslate.WRITING_ROMAN)
>>> print(gs_roman.translate('hello', 'zh'))
Nǐ hǎo

detect(text)

Detect language of the input text

Note

  • Input all source strings at once. Goslate will detect concurrently to maximize speed.
  • futures is required for best performance.
  • It returns a generator on batch input in order to better fit a pipeline architecture.
Parameters: text (UTF-8 str; unicode; sequence of string) – The source text(s) whose language you want to identify. Batch detection is supported via sequence input
Returns: the language code(s)
  • unicode: on single string input
  • generator of unicode: on batch input of string sequence
Raises: Error if parameter type or value is not valid
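
Because batch detection returns a generator, a common next step is pairing each result with its input. The following is a minimal sketch, assuming results are yielded in the same order as the inputs. detect_all is a hypothetical helper, and FakeDetector is an offline stand-in for Goslate with hard-coded answers so the sketch runs without network access.

```python
def detect_all(gs, texts):
    """Map each text to its detected language code.

    gs is any object with a goslate-style detect() method.
    Assumes results are yielded in the same order as the inputs.
    """
    texts = list(texts)  # allow the input itself to be a generator
    return dict(zip(texts, gs.detect(texts)))


class FakeDetector(object):
    """Offline stand-in for goslate.Goslate, for illustration only."""

    def detect(self, texts):
        for text in texts:
            yield "de" if text == "Hallo" else "en"


result = detect_all(FakeDetector(), ["hello", "Hallo"])
print(result)
```

With the real library, pass a Goslate instance in place of FakeDetector().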

Example:

>>> from goslate import Goslate
>>> gs = Goslate()
>>> print(gs.detect('hello world'))
en
>>> for i in gs.detect([u'hello', 'Hallo']):
...     print(i)
...
en
de

get_languages()

Discover supported languages

It returns ISO 639-1 language codes for the languages supported for translation. Some language codes also include a country code, like zh-CN or zh-TW.

Note

It queries Google only on the first call and uses the cached result afterwards.

Returns: a dict mapping each supported language code to its language name, {'language-code': 'Language name'}
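
The returned dict can be used to validate a language code before translating. The following is a minimal sketch; check_language is a hypothetical helper, and FakeLanguages is an offline stand-in for Goslate with a hard-coded subset of languages so the sketch runs without network access.

```python
def check_language(gs, code):
    """Return the language name for code, or raise ValueError if unsupported.

    gs is any object with a goslate-style get_languages() method.
    """
    languages = gs.get_languages()
    if code not in languages:
        raise ValueError("unsupported language code: %r" % (code,))
    return languages[code]


class FakeLanguages(object):
    """Offline stand-in for goslate.Goslate, for illustration only."""

    def get_languages(self):
        return {"en": "English", "de": "German", "zh": "Chinese"}


print(check_language(FakeLanguages(), "zh"))
```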
Example:
>>> from goslate import Goslate
>>> languages = Goslate().get_languages()
>>> assert 'zh' in languages
>>> print(languages['zh'])
Chinese

translate(text, target_language, source_language=u'')

Translate text from source language to target language

Note

  • Input all source strings at once. Goslate will batch and fetch concurrently to maximize speed.
  • futures is required for best performance.
  • It returns a generator on batch input in order to better fit a pipeline architecture.
Parameters:
  • text (UTF-8 str; unicode; string sequence (list, tuple, iterator, generator)) – The source text(s) to be translated. Batch translation is supported via sequence input
  • target_language (str; unicode) – The language to translate the source text into. The value should be one of the language codes listed in get_languages()
  • source_language (str; unicode) – The language of the source text. The value should be one of the language codes listed in get_languages(). If a language is not specified, the system will attempt to identify the source language automatically.
Returns:

the translated text(s)

  • unicode: on single string input
  • generator of unicode: on batch input of string sequence
  • tuple: if WRITING_NATIVE_AND_ROMAN is specified, it returns a tuple (or, on batch input, a generator of tuples) of the form (u"native", u"roman format")

Raises:
  • Error (‘invalid target language’) if the target language is not set
  • Error (‘input too large’) if the input is a single large word without any punctuation or spaces in between
Example:
>>> from goslate import Goslate
>>> gs = Goslate()
>>> print(gs.translate('Hello World', 'de'))
Hallo Welt
>>> 
>>> for i in gs.translate(['good', u'morning'], 'de'):
...     print(i)
...
gut
Morgen
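
Since translate() returns a single string for string input but a generator for batch input, callers that want a uniform result type may normalize it. The following is a minimal sketch, assuming Python 3 (where text strings are str). translate_to_list is a hypothetical helper, and FakeTranslator is an offline stand-in for Goslate with a hard-coded word list so the sketch runs without network access.

```python
def translate_to_list(gs, text, target_language):
    """Translate a string or a sequence of strings, always returning a list."""
    if isinstance(text, str):
        # Single string input: translate() returns one string.
        return [gs.translate(text, target_language)]
    # Batch input: translate() returns a generator, consumed here.
    return list(gs.translate(text, target_language))


class FakeTranslator(object):
    """Offline stand-in for goslate.Goslate, for illustration only."""

    WORDS = {"good": "gut", "morning": "Morgen"}

    def translate(self, text, target_language):
        if isinstance(text, str):
            return self.WORDS.get(text, text)
        return (self.WORDS.get(t, t) for t in text)


print(translate_to_list(FakeTranslator(), "good", "de"))
print(translate_to_list(FakeTranslator(), ["good", "morning"], "de"))
```

Consuming the generator into a list gives up the lazy behaviour, so this fits small batches better than long pipelines.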

To output a romanized translation:

Example:
>>> from goslate import Goslate, WRITING_ROMAN
>>> gs_roman = Goslate(WRITING_ROMAN)
>>> print(gs_roman.translate('Hello', 'zh'))
Nǐ hǎo

goslate.WRITING_NATIVE = (u'trans',)

native target language writing system

goslate.WRITING_NATIVE_AND_ROMAN = (u'trans', u'translit')

both native and roman writing. The output will be a tuple
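
The tuple can be unpacked into its native and roman parts. The following is a minimal sketch, assuming the roman part may be an empty string for languages without romanization. format_native_and_roman is a hypothetical helper, and the sample tuples are illustration values, not captured library output.

```python
def format_native_and_roman(result):
    """Render a (native, roman) tuple as 'native (roman)'.

    Falls back to the native text alone when the roman part is empty.
    """
    native, roman = result
    if roman:
        return u"%s (%s)" % (native, roman)
    return native


# Illustration values only.
print(format_native_and_roman((u"你好", u"Nǐ hǎo")))
print(format_native_and_roman((u"Hallo", u"")))
```

With the real library, result would come from Goslate(WRITING_NATIVE_AND_ROMAN).translate('Hello', 'zh').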

goslate.WRITING_ROMAN = (u'translit',)

romanized writing system. Only valid for some languages; otherwise it outputs an empty string
