🌍 Convert everything to ASCII

Tate ✍️

Tate converts accented characters and transliterates non-Latin scripts to their closest ASCII equivalents.

Tate is a productivity tool. It behaves like a standard Unix application and can be chained with other Unix commands: it reads from standard input and writes to standard output. You can use it either as a command-line utility or as a library.

Examples

Let's say you have a French sentence full of accented characters and ligatures, and you want to convert it to ASCII in the most representative way. You can use:

echo 'Le cœur de la crémiére' | tate  #=> Le coeur de la cremiere

Or some Bulgarian text you can't read:

echo 'Здравей!' | tate --lang=bg  #=> Zdravey!

Set the language with the --lang option to apply custom filters, e.g. for German:

echo 'Von großen Blöcken haut man große Stücke.' | tate --lang=de

Letters ö, ü and ß will be transliterated based on German transliteration rules:

Von grossen Bloecken haut man grosse Stuecke.
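
To make the idea of language-specific rules concrete, here is a minimal sketch in plain Ruby. It is not Tate's implementation: the names GERMAN_RULES and fold_to_ascii are hypothetical, and the approach (apply a per-language replacement table first, then strip combining marks after Unicode decomposition) is only an assumption about how such a filter could work.

```ruby
# Hypothetical sketch, NOT Tate's actual code.
# Per-language replacement table (assumed name and contents).
GERMAN_RULES = {
  'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
  'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
  'ß' => 'ss'
}.freeze

# Folds text towards ASCII. Language rules take priority over the
# generic fallback, which only removes combining diacritical marks.
def fold_to_ascii(text, rules = {})
  # Compose first so that precomposed keys such as 'ö' match reliably.
  result = text.unicode_normalize(:nfc)
  # Replace every character that has a language-specific rule.
  result = result.gsub(Regexp.union(rules.keys), rules) unless rules.empty?
  # Decompose what is left and drop the combining marks (category Mn).
  result.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts fold_to_ascii('Von großen Blöcken haut man große Stücke.', GERMAN_RULES)
# prints: Von grossen Bloecken haut man grosse Stuecke.
```

Without the German table, the generic fallback in this sketch would turn ö into o and leave ß untouched, because ß (like œ) has no Unicode decomposition. Characters of that kind need an explicit rule.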

Language-specific punctuation is converted to its closest ASCII equivalent.

For example, in Catalan, notice how the quotes (cometes franceses) and the interpunct (punt volat) are transliterated:

«Dóna amor que seràs feliç!». Això, il·lús company geniüt, ja és un lluït rètol blavís d’onze kWh.
"Dona amor que seras felic!". Aixo, il-lus company geniut, ja es un lluit retol blavis d'onze kWh.

Installation

Add this line to your application's Gemfile:

gem 'tate'

And then execute:

$ bundle

Or install it yourself as:

$ gem install tate

Usage

Ruby Library

require 'tate'
# The language code is passed as the second positional argument.
Tate.transliterate('Zəfər', 'az')  #=> Zefer

Command-line Utility

Usage: tate [options]
-l, --lang=[LANGUAGE]            Set language for custom filters
-h, --help                       Show this message
-v, --version                    Show version

Interactive Mode

If you call tate without any arguments, it reads from standard input (the keyboard). When you are done typing, press Ctrl + D to send EOF (End of File), and the result will be printed on the next line.

Standard Streams

You can pipe the output of another command into tate.

curl gov.bg/bg | tate --lang=bg > index.html
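
The filter pattern itself (read a stream, transform each line, write to another stream) can be sketched in a few lines of Ruby. This is an illustration only: filter_stream and strip_marks are hypothetical names, and the transform is a stand-in that merely drops combining marks, not Tate's transliteration logic. StringIO objects play the role of the streams here so the example is self-contained; a real script would pass $stdin and $stdout instead.

```ruby
require 'stringio'

# Hypothetical helper: copies input to output line by line,
# passing each line through the given block.
def filter_stream(input, output)
  input.each_line { |line| output.write(yield(line)) }
  output
end

# Stand-in transform (NOT Tate's logic): decompose, then drop combining marks.
strip_marks = ->(line) { line.unicode_normalize(:nfd).gsub(/\p{Mn}/, '') }

# StringIO stands in for standard input and standard output.
source = StringIO.new("crème brûlée\nZoë\n")
sink = StringIO.new

filter_stream(source, sink, &strip_marks)
puts sink.string
# prints:
# creme brulee
# Zoe
```

Because the helper only relies on each_line and write, it works with any IO-like object, which is what lets a tool of this kind sit in the middle of a pipeline.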

Language Support

There are custom filters for:

Azeri, Bulgarian, Catalan, French, German, Hungarian, Polish, Romanian, Portuguese, Spanish, and Vietnamese.

The following languages are known to work (w/o custom filters):

Croatian, Czech, Danish, Esperanto, Estonian, Finnish, Icelandic, Latvian, Lithuanian, Norwegian, Scottish, Slovak, Slovenian, Swedish, Turkish, and Welsh.

What's next?

Russian, Irish, Arabic, and Yoruba.

Is it any good?

Yes.

Support

This gem is tested against the following Ruby versions:

  • 3.2.2 (stable)
  • 3.1.4 (stable)
  • 3.0.6 (security maintenance)
  • 🪦 2.7.8 (end of life)

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b add-irish-support)
  3. Commit your changes (git commit -am 'Add Irish language support')
  4. Push to the branch (git push origin add-irish-support)
  5. Create a new Pull Request

Custom Filters

You can add custom language filters under the lib/rules directory.

Donations ❤️

You can donate to me on Liberapay. Thanks! ☕️

Trivia

tate is short for transliterate.

Nobody has time to type transliterate in the terminal.

License

Copyright © 2016-2023 Kerem Bozdas

This project is available under the terms of the MIT License.