Git Companion Scripts
Introduction
A collection of useful scripts for Git. It currently contains file encoding validation and conversion scripts, aimed especially at multi-platform development.
They are expected to work in most Git environments with no additional software installation. Git for Windows (msysgit) is also supported.
More specifically, the following software is assumed to be available:
- Perl 5.8 or later (with Encode module).
- Bourne shell (/bin/sh) and standard Unix commands for tests.
Directories and files:
- hooks - Hook scripts for .git/hooks
  - pre-commit-encoding - pre-commit script to verify file encoding.
- utils - Utility scripts
  - ipconv - In-place converter of text encoding and newline characters.
- tests - Test scripts
  - shunit2 - shUnit2 Unix shell unit testing framework.
  - test-* - Individual test scripts.
  - fixtures - Test fixtures
    - txtgen.pl - Text fixture generator.
    - *.txt - Text fixtures.
  - runtests.sh - Script to run all tests.
Usage
hooks/pre-commit-encoding
Call this script from the pre-commit hook .git/hooks/pre-commit.
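As a minimal sketch of wiring this up, the helper below writes a .git/hooks/pre-commit file that simply delegates to the checker. The function name install_encoding_hook and both of its arguments are hypothetical, and the sketch assumes the checker needs no arguments and signals failure through its exit status.

```sh
#!/bin/sh
# install_encoding_hook REPO SCRIPT
# Hypothetical helper (not part of this collection): writes a pre-commit
# hook into REPO that runs SCRIPT, the path to hooks/pre-commit-encoding.
install_encoding_hook() {
    ieh_repo=$1
    ieh_script=$2
    ieh_hook="$ieh_repo/.git/hooks/pre-commit"
    mkdir -p "$ieh_repo/.git/hooks" || return 1
    # Unquoted heredoc: $ieh_script is expanded while the hook is written.
    cat > "$ieh_hook" <<EOF
#!/bin/sh
# Delegate to the encoding checker; a non-zero exit aborts the commit.
exec "$ieh_script"
EOF
    chmod +x "$ieh_hook"
}
```

For example, install_encoding_hook . "$HOME/git-companion/hooks/pre-commit-encoding" would set up the hook for the repository in the current directory, where the second path is only a placeholder for wherever the scripts are checked out.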
You can specify the encodings allowed to be committed for each file pattern by
writing encoding or encoding=ENCODINGS for it in .gitattributes.
ENCODINGS is an optional parameter of comma-separated encodings.
The script accepts Emacs-like encoding notation such as utf-8, utf-8-dos,
utf-8-with-signature-unix, and so on. Omitting the newline specifier
(-unix, -dos, -mac) means "don't care": any newline characters will be
accepted.
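To make the notation concrete, here is a small sketch that splits one such name into its parts. It is an illustration only, not the parser used by the script; the function name parse_encoding and its three-word output (base encoding, whether -with-signature was given, newline specifier or "any") are assumptions made for this example.

```sh
#!/bin/sh
# parse_encoding NAME
# Hypothetical helper: prints "BASE SIGNATURE NEWLINE" for an Emacs-like
# encoding name, e.g. "utf-8 yes dos" for utf-8-with-signature-dos.
parse_encoding() {
    pe_spec=$1
    # Strip a trailing newline specifier, if any.
    case $pe_spec in
        *-unix) pe_eol=unix; pe_spec=${pe_spec%-unix} ;;
        *-dos)  pe_eol=dos;  pe_spec=${pe_spec%-dos} ;;
        *-mac)  pe_eol=mac;  pe_spec=${pe_spec%-mac} ;;
        *)      pe_eol=any ;;   # no specifier: any newline style
    esac
    # Strip the BOM marker, if any.
    case $pe_spec in
        *-with-signature) pe_sig=yes; pe_spec=${pe_spec%-with-signature} ;;
        *)                pe_sig=no ;;
    esac
    printf '%s %s %s\n' "$pe_spec" "$pe_sig" "$pe_eol"
}

parse_encoding utf-8-with-signature-unix   # prints "utf-8 yes unix"
parse_encoding ascii-dos                   # prints "ascii no dos"
parse_encoding utf-8                       # prints "utf-8 no any"
```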
If the encoding attribute is specified without any ENCODINGS parameter, the
default encodings will be used. The default encodings can be specified by a
script argument or the $default_encoding variable in the script.
Some .gitattributes examples:
# Force ascii on log files.
*.log encoding=ascii
# Specify default encodings on text files.
*.txt encoding
# Specify a macro for encodings MSVC can process.
[attr]msvc encoding=ascii-dos,utf-8-with-signature-dos
*.c msvc
*.h msvc
*.cpp msvc
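To see which encodings a pattern resolves to for a given file, Git's own git check-attr command can be used; it prints lines of the form "PATH: ATTRIBUTE: VALUE". The sketch below interprets one such line. The function name allowed_encodings and the words "default" and "none" in its output are assumptions for this example, and whether the hook script itself relies on git check-attr is not stated here.

```sh
#!/bin/sh
# allowed_encodings LINE
# Hypothetical helper: LINE is one line of `git check-attr encoding` output.
# Prints the listed encodings separated by spaces, "default" when the
# attribute is set without a value, or "none" when it is not set.
allowed_encodings() {
    ae_value=${1##*: encoding: }   # keep only the text after the last marker
    case $ae_value in
        unspecified|unset) echo none ;;
        set)               echo default ;;
        *)                 echo "$ae_value" | tr ',' ' ' ;;
    esac
}

allowed_encodings 'hoge.c: encoding: ascii-dos,utf-8-with-signature-dos'
# prints "ascii-dos utf-8-with-signature-dos"
```

Inside a repository, a line in this format can be obtained with: git check-attr encoding -- hoge.c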
When you run git commit, the script checks each staged file for a valid
character encoding, valid newline characters, and the UTF-8 BOM before the
commit proceeds. If any violation is found, the commit is aborted.
% git add hoge.c
% git commit
hoge.c: utf8-unix (ascii-dos,utf-8-with-signature-dos)
Commit aborted! (Use "git commit --no-verify" to skip this)
Use git commit --no-verify to skip checks by the pre-commit script.
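The newline and BOM parts of such a check can be approximated with standard Unix commands. The sketch below is not the script's implementation (which is written in Perl); the function names has_bom and newline_style are hypothetical, and the newline classification is a rough heuristic based only on CR and LF byte counts.

```sh
#!/bin/sh
# has_bom FILE
# Succeeds when FILE starts with the UTF-8 BOM bytes EF BB BF.
has_bom() {
    [ "$(od -An -tx1 -N3 "$1" | tr -d ' \n')" = "efbbbf" ]
}

# newline_style FILE
# Prints unix, dos, mac, or mixed. Heuristic: compares the number of CR
# and LF bytes, so it does not verify that every CR is followed by an LF.
newline_style() {
    ns_cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    ns_lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$ns_cr" -eq 0 ]; then
        echo unix
    elif [ "$ns_lf" -eq 0 ]; then
        echo mac
    elif [ "$ns_cr" -eq "$ns_lf" ]; then
        echo dos
    else
        echo mixed
    fi
}
```

A file that should be utf-8-with-signature-dos would then be expected to satisfy has_bom and report dos from newline_style; validating the character encoding itself is left out of this sketch.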
Limitations:
- With older Git versions such as v1.7.4, it does not work for an initial commit.
- It cannot handle files whose names contain the sequence ": " (COLON SPACE).
utils/ipconv
ipconv is an in-place converter of text encoding and newline characters.
Specify the output encoding with the -e option,
or modify the $output_encoding variable in the script.
It accepts Emacs-like encoding notation such as utf-8, utf-8-unix, and utf-8-with-signature-unix.
Specifying no newline specifier means "don't touch": newline characters are left unmodified.
It creates backup files with the suffix .orig.
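The in-place-with-backup behaviour can be pictured with the following sketch, which handles only the newline half of the job (CRLF to LF). It is not ipconv itself: the function name to_unix_in_place is hypothetical, and encoding conversion is deliberately left out.

```sh
#!/bin/sh
# to_unix_in_place FILE
# Hypothetical helper: keeps the original as FILE.orig, then rewrites FILE
# with every CR byte removed (so CRLF becomes LF; bare CRs are dropped too).
to_unix_in_place() {
    cp -p "$1" "$1.orig" || return 1
    tr -d '\r' < "$1.orig" > "$1"
}
```

If the result is not what was wanted, the untouched original can be restored by moving FILE.orig back over FILE.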
License
tests/shunit2 is taken from shUnit2 and licensed under LGPL.
Other files are licensed under MIT.
Copyright (c) 2012 Takeshi Yaegashi.