A New GitHack Script

A Git hack becomes possible when site maintainers use Git to manage a website's source code but forget to delete the .git directory from the deployed site. By accessing that directory over HTTP, you can easily recover Git metadata and even the source code of the target website. Sometimes you will get a 403 when you visit that URL, but that only means directory listing is restricted. In that case, you can still download individual files if you know their exact URLs.

Why I wrote this script

Recently I took part in a CTF organized by Wuhan University. One of the web challenges was about a Git hack.

When I tried to solve that challenge with a widely used GitHack script, which was supposed to download the source code, no files were downloaded even though all of the .git files were listed on the page. I quickly analyzed the script and found the reason.

That script works as follows:

  1. Retrieve the index file and extract file names and SHA1;
  2. Download the files with the SHA1;
  3. Save the downloaded files into organized directories.
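
Step 1 above can be sketched roughly as follows. This is not the code of the script I was using, only a minimal illustration that assumes a version-2 index and paths shorter than 4095 bytes; the function name parse_index is made up for this example. Note that an index with zero entries yields an empty list, which leaves steps 2 and 3 with nothing to do.

```python
import struct


def parse_index(data):
    """Return a list of (path, sha1_hex) pairs from a version-2 Git index."""
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    if version != 2:
        raise ValueError("this sketch only handles index version 2")

    entries = []
    offset = 12
    for _ in range(count):
        # Each entry starts with 40 bytes of stat data, followed by the
        # 20-byte SHA-1 and a 2-byte flags field.
        sha1 = data[offset + 40:offset + 60]
        (flags,) = struct.unpack(">H", data[offset + 60:offset + 62])
        name_len = flags & 0x0FFF  # low 12 bits hold the path length
        path = data[offset + 62:offset + 62 + name_len]
        entries.append((path.decode("utf-8", "replace"), sha1.hex()))
        # Entries are NUL-padded (at least one NUL) to a multiple of 8 bytes.
        offset += (62 + name_len + 8) & ~7
    return entries
```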

Everything seems to work perfectly. However, step 1 has a fatal weakness: if there are no files in the current state of the repository, the index file contains no entries, so the script has nothing to download!

"The index is a single, large, binary file in <baseOfRepo>/.git/index, which lists all files in the current branch, their sha1 checksums, time stamps and the file name; it is not another directory with a copy of files in it." [1]

After realizing this problem, I decided to write a new script that uses a different method.

How my script works

First off, we need to know which parts of the .git directory are required to restore files.

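  • HEAD: a reference to the pointer of the current branch. (ref: refs/heads/master)
  • refs/heads/master: the real pointer, i.e. the hash of the latest commit on that branch. (f43003bce4f11d9b2532b5fac0a0006126f14e2a)
  • objects: every object is stored here under its hash, in the format objects/[first 2 characters of the hash]/[remaining 38 characters]. Files are compressed with zlib.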
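
The three items above translate into a few lines of Python. This is only a sketch: the helper names and the example.com base URL are placeholders I made up, not part of any existing tool.

```python
def parse_head(head_text):
    """Return the ref path named in HEAD, or None if HEAD holds a raw hash."""
    text = head_text.strip()
    if text.startswith("ref:"):
        return text[len("ref:"):].strip()
    return None


def object_path(sha1_hex):
    """Map a 40-character hash to its location inside the .git directory."""
    if len(sha1_hex) != 40:
        raise ValueError("expected a 40-character SHA-1 hash")
    return "objects/{}/{}".format(sha1_hex[:2], sha1_hex[2:])


def object_url(base_url, sha1_hex):
    """Build the download URL of an object for a site rooted at base_url."""
    return base_url.rstrip("/") + "/.git/" + object_path(sha1_hex)
```

For the hash shown above, object_url("http://example.com/", "f43003bce4f11d9b2532b5fac0a0006126f14e2a") gives http://example.com/.git/objects/f4/3003bce4f11d9b2532b5fac0a0006126f14e2a, which can be requested directly even when directory listing returns 403.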

Secondly, there are three types of objects in the objects directory that matter here.

Commit

A commit records the Git log information (author, committer, message) and the hashes of its tree node and parent node.
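
To make this concrete, here is a small sketch that decompresses a commit object and pulls out the tree and parent hashes. The function name parse_commit and the returned dictionary layout are my own choices for illustration.

```python
import zlib


def parse_commit(raw):
    """Parse a zlib-compressed commit object as stored under objects/."""
    data = zlib.decompress(raw)
    # The object starts with a header of the form b"commit <size>\x00".
    header, _, body = data.partition(b"\x00")
    kind = header.split(b" ")[0]
    if kind != b"commit":
        raise ValueError("not a commit object: %r" % kind)

    # Header lines come first; a blank line separates them from the message.
    head_lines, _, message = body.decode("utf-8", "replace").partition("\n\n")
    info = {"tree": None, "parents": [], "message": message}
    for line in head_lines.split("\n"):
        key, _, value = line.partition(" ")
        if key == "tree":
            info["tree"] = value
        elif key == "parent":
            # A merge commit has more than one parent line.
            info["parents"].append(value)
    return info
```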

Tree

A tree describes the directory structure. Each entry of a tree node points to either a blob or another tree, and records a filename together with the hash of the object it points to. The graph in [2] illustrates the structure.
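
A tree object is binary: after the header, each entry is the mode, a space, the filename, a NUL byte, and then the 20 raw bytes of the hash. The sketch below parses an already decompressed tree. The name parse_tree is made up, and for simplicity every mode other than 40000 is reported as a blob, which ignores submodule links.

```python
def parse_tree(data):
    """Return (kind, name, sha1_hex) tuples from a decompressed tree object."""
    header, _, body = data.partition(b"\x00")
    if header.split(b" ")[0] != b"tree":
        raise ValueError("not a tree object")

    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)      # end of the mode field
        nul = body.index(b"\x00", space)   # end of the filename
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "replace")
        sha1_hex = body[nul + 1:nul + 21].hex()  # 20 raw bytes, not hex text
        kind = "tree" if mode == "40000" else "blob"
        entries.append((kind, name, sha1_hex))
        pos = nul + 21
    return entries
```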

Blob

A blob holds the content of a single file.
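
Extracting a blob is the simplest case: decompress, drop the header, keep the rest. Since an object's name is the SHA-1 of its decompressed bytes (header included), a download can also be verified. The function name read_blob and the optional expected_sha1 parameter are assumptions of this sketch.

```python
import hashlib
import zlib


def read_blob(raw, expected_sha1=None):
    """Return the file content stored in a zlib-compressed blob object."""
    data = zlib.decompress(raw)
    # Optionally check that the download matches the hash it was fetched by.
    if expected_sha1 is not None:
        if hashlib.sha1(data).hexdigest() != expected_sha1:
            raise ValueError("object does not match its hash")

    # The header has the form b"blob <size>\x00".
    header, _, content = data.partition(b"\x00")
    kind, _, size = header.partition(b" ")
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("malformed blob object")
    return content
```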

In step 1, my script retrieves the first commit hash from the reference that HEAD points to, and then traverses all commit hashes recursively. In each round, the script locates the tree, parses it, then finds the blobs and extracts them. Finally, all files are sorted into organized directories.
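
The traversal described above might look roughly like the sketch below. It is not the actual code from my repository: fetch is a hypothetical callable that takes a hash and returns the decompressed object bytes (for example, a wrapper around an HTTP download plus zlib), and the result maps each commit hash to a dictionary of path and content pairs.

```python
def split_object(data):
    """Split a decompressed object into its type name and its body."""
    header, _, body = data.partition(b"\x00")
    return header.split(b" ")[0].decode("ascii"), body


def walk_tree(fetch, tree_sha, prefix, files):
    """Collect every blob reachable from a tree into the files dict."""
    _, body = split_object(fetch(tree_sha))
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = body[pos:space]
        name = body[space + 1:nul].decode("utf-8", "replace")
        sha = body[nul + 1:nul + 21].hex()
        pos = nul + 21
        if mode == b"40000":
            # Sub-directory: recurse with an extended path prefix.
            walk_tree(fetch, sha, prefix + name + "/", files)
        elif mode != b"160000":
            # Skip submodule links; treat everything else as a blob.
            files[prefix + name] = split_object(fetch(sha))[1]


def walk_commits(fetch, commit_sha, seen=None):
    """Recursively visit a commit and its parents, restoring each snapshot."""
    if seen is None:
        seen = {}
    if commit_sha in seen:  # already handled, e.g. reached via a merge
        return seen
    _, body = split_object(fetch(commit_sha))
    tree, parents = None, []
    for line in body.split(b"\n\n", 1)[0].decode("utf-8", "replace").split("\n"):
        key, _, value = line.partition(" ")
        if key == "tree":
            tree = value
        elif key == "parent":
            parents.append(value)
    if tree is None:
        raise ValueError("commit %s has no tree" % commit_sha)
    files = {}
    walk_tree(fetch, tree, "", files)
    seen[commit_sha] = files
    for parent in parents:
        walk_commits(fetch, parent, seen)
    return seen
```

Plain recursion keeps the sketch short, but a very long history could hit Python's recursion limit; an explicit stack would avoid that.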

Script

The script downloads the files without requiring Git to be installed.

Project: https://github.com/hazzel-cn/GitHack

Usage:

Screenshots: githack.py, gitclone.py

Reference

  • [1] https://stackoverflow.com/questions/3689838/whats-the-difference-between-head-working-tree-and-index-in-git/3690796
  • [2] https://www.jianshu.com/p/8659c9ae00cb
  • [3] https://www.jianshu.com/p/6d93d6153070
