list_put() using commits were added. See Test Cases.
write_file() in Utils.cpp. This affected WSL and Linux users only. A memory-leak bug was fixed for merge() in Repository.cpp. As a result, program skeleton and demo programs were updated. See Resources to download the latest version.
pa2-diagrams.pdf was updated. See Resources to download the latest version.
Tester.cpp. A bug that prevents files in conflict from saving when doing merge was fixed in Repository.cpp. As a result, program skeleton and demo programs were updated. See Resources to download the latest version.
We value academic integrity very highly. Please read the Honor Code section on our course webpage to make sure you understand what is considered as plagiarism and what the penalties are. The following are some of the highlights:
The objective of this assignment is to provide you with practice on structures, pointers, linked lists and dynamic memory allocation and deallocation. Upon completion of this assignment, you should be able to:
In this assignment, you are going to implement a version control system (VCS) that resembles Git. Our version is basically a simplified version of Git, and thus it is named Gitlite . You may have heard of GitHub, which is an web service that is based on Git. If you have submitted code to GitHub, then you must have used Git (although you may not be aware of this).
People use version control system to keep track of the changes of files they wish to maintain. Essentially, version control systems maintain different versions of the same file that exist at various time, or different versions of the same file that exist simultaneously. Version control systems are an integral part in software development. In the upcoming project of this course, you may want use Git (and GitHub) to facilitate collaboration and versioning. Watch this video to know more on version control systems.
It would be helpful to be familiar with Git concepts before you start working on this assignment. You can refer to the official guide from Git or any Git introduction online. But here we provide a brief introduction to essential concepts.
A repository is the database of a version control system. A repository tracks a file if it maintains the version of the file. The tracked files of the repository are the files that are tracked by the repository. Untracked files are the files that are not maintained by the repository.
A commit records a snapshot, or the state of the repository. Each commit therefore records the content of the tracked files of the repository when the commit was made. Initially, the repository is empty and hence there exists an initial commit that tracks no files.
Files can be changed in three ways: a new file is added, the content of an existing file is modified, and a file is deleted. To record changes to the repository, the changes are staged: For adding a new file or modifying an existing file, the file is added to the staging area, where the current content is saved. Bear in mind that the staging area only contains the version of the file when it was staged. If a file is modified after staging, the changes are not recorded in the repository unless the file is staged again. For deleting a file, it is removed from the tracked files.
When a new commit is made, only the staged changes are recorded. For other tracked files, they inherit from the previous commit. The previous commit becomes the parent commit of the new commit. As a result, the repository maintains a history of commits where files can be restored from, which is referred as checkout.
A sequence of commits is referred as a branch. A branch is associated with a head commit. The commit sequence of the branch is obtained by following the parent commits starting from the head commit of the branch. A branch can be checked out. This restores all files from the head commit of that branch and change the tracked files to be those tracked by the head commit as well.
The history of changes recorded by different branches can be merged together. This is essential in software development: Software developers work on the project by committing to their own branch. Ultimately, they would merge their work into the master branch which has the combined product of the developers.
To support versioning, we need an efficient way to determine whether two versions are different. In other words, we wish to generate "fingerprints" for files such that we can check the fingerprints to determine whether two file versions are different.
To achieve this, we compute the SHA-1 value of file contents. SHA-1 is a cryptographic hash function that produces a hash value with 160 bits. Usually, we represent the hash values using 40-digit hex strings (since a hex digit represents 4-bit binary value):
SHA-1 for "Gitlite": 4ed6b233fe07f0b956e3460e9774f7bfd25d5b7c
SHA-1 for "gitlite": ddf639cf894badea22b7b7370dc677dedd404ff7
Cryptographic hash functions have the property that it is difficult to generate two inputs that have the same hash value. So we can say that given two file versions, the probability that their hash values are equal is one divided by two to the power of 160, which is negligible. Thus, we can simply check whether two hex strings are the same to determine whether two files versions are identical.
A third-party library, TinySHA1, is utilized to compute SHA-1 values. You don't need to understand and use this library as helper functions have been implemented to use this library (See Compilation to see how we can compile Gitlite with external libraries).
Gitlite requires some data structures representing the state of the repository to be maintained persistently across different executions of Gitlite. For example, one may use Gitlite to make a commit. When he runs Gitlite the next time, the commit should not be lost.
To achieve this, Gitlite maintains a .gitlite directory which stores the data structures of Gitlite (See Appendix for the structure of the .gitlite directory). Helper functions have been implemented to perform all filesystem operations which use the filesystem feature from C++17. A third-party library, cereal is used for serializing objects upon filesystem operations (See Compilation to see the compilation requirements of Gitlite).
Since Gitlite is a simplified version of Git, there exists many subtle difference with Git. These will be mentioned when the Gitlite commands to implemented are described. If you are a sophisticated Git user, pay attention to these differences and make sure you implement the expected behavior of Gitlite.
pa2/
├── auto-testing/
│ ├── src/
│ └── tests/
├── include/
│ ├── TinySHA1.hpp*
│ └── cereal/
├── Commit.cpp*
├── Commit.h*
├── Makefile
├── Repository.cpp*
├── Repository.h*
├── Tester.cpp*
├── Tester.h*
├── Utils.cpp*
├── Utils.h*
├── gitlite.cpp*
├── gitlite.h*
└── main.cpp*
Commit.h: defines the data structure of Gitlite.
Commit.cpp: contains the linked list operations to implement in Part 1, and some helper functions for Part 2.
gitlite.h/.cpp: contains the Gitlite commands to implement in Part 2.
Repository.h/.cpp: contains the wrapper functions of the Gitlite commands that handle most filesystem operations, especially those that deal with the .gitlite directory.
Utils.h/.cpp: contains general helper functions you may use.
Tester.h/.cpp: the automated testing module.
auto-testing/: the directory of the automated testing environment. Contains test cases and related files.
include/: the directory that contains the two third-party libraries used in Gitlite. You do not need use them directly.
TinySHA1 is employed to compute SHA-1 hash values for files and commits. Helper functions have been implemented in Utils.cpp that invokes functions from this library.
cereal is employed to serialize objects when writing objects to file (and de-serialize objects when reading from file).
Makefile: the Makefile to build Gitlite (See Compilation).
You only need to submit Commit.cpp and gitlite.cpp (See Submission & Grading).
The following data data structures are defined in Commit.h and are essential in Gitlite.
Blob object as the common data structure to represent files and branches.
struct Blob {
string name;
string ref;
Commit *commit = nullptr;
Blob *prev = nullptr, *next = nullptr;
};
name stores the filename, ref stores the SHA-1 hash value in hex string. This hash value is used by the wrapper functions to refer to the saved copy of the file in the repository (See Appendix for the directory structure of .gitlite). commit is unused.
name stores the name of the branch, commit points to the head commit of the branch. ref is unused. Blob objects do not own commits, so no need to free the commit when freeing a Blob.
prev and next are pointers to the previous and next element in the linked list.
List is a circular linked list with sentinel node of Blob objects. Implementation of linked list operations is the task of Part 1.
struct List {
Blob *head = nullptr;
};
Blob objects in a linked list are always either represents a file or a branch, but not both.
Commit objects are abstractions of commits.
struct Commit {
string message;
string time;
string commit_id;
Commit *parent = nullptr, *second_parent = nullptr;
List *tracked_files = nullptr;
};
message is the commit message.
time is the time string of when the commit was made.
commit_id is the commit id that uniquely identifies the commit. We compute the commit using the SHA-1 hash value of the commit message and time. This is done by a helper function.
parent points to the parent commit. It is nullptr only if it is the first commit, referred as the initial commit, of the repository.
second_parent points to the second parent commit, which exists only for the commit created after merging two branches. Otherwise, it is nullptr.
tracked_files is the linked list of Blob objects representing the files tracked by the repository when this commit was made. Different commits do not share same Blob objects. If a tracked file is identical in two commits, the two Blob objects in each of the commit have the same ref value.
Implement the following helper functions in Commit.cpp that operate on linked lists. Since linked list operations are essential in Gitlite, ZINC testcases are provided for you to check the correctness of linked list operations before proceeding to Part 2.
List *list_new();
void list_push_back(List *list, Blob *blob);
list and blob are not nullptr.
Blob *list_find_name(const List *list, const string &name);
nullptr if no blobs were found.
list is not nullptr.
Blob *list_put(List *list, const string &name, const string &ref);
Blob *list_put(List *list, const string &name, Commit *commit);
< and >. Basically, if a < b, then a should appear before b in the linked list. (Blobs in Example 2 in Appendix follow lexicographic order)
list and commit are not nullptr.
bool list_remove(List *list, const string &target)
true if a blob with the target name was found and removed. false if no blobs with the target name were found.
list is not nullptr.
int list_size(const List *list);
list is not nullptr.
void list_clear(List *list);
list is not nullptr.
void list_delete(List *list);
list is not nullptr.
list is guaranteed not be used in subsequent operations.
void list_replace(List *list, const List *another);
list and another are not nullptr.
list and another do not refer to the same linked list.
List *list_copy(const List *list);
list is not nullptr.
Implement following Gitlite commands by completing the corresponding functions.
These functions are called by wrapper functions already implemented in Repository.cpp. Unless specified, the wrapper functions handle all filesystem operations to the .gitlite directory. For some commands, you need manipulate files in the current working directory. See Utils.cpp for related helper functions.
When Gitlite is started, the wrapper functions would read from the .gitlite directory and construct the following variables that are also passed as parameters to the functions you need to implement. When the functions return, wrapper functions would write the changes of these variables to the .gitlite directory. Therefore, a primary task of this part is to manipulate the following variables according to the Gitlite commands. The following variables, combined, can be considered as the state of the repository.
List *tracked_files: The currently tracked files of the repository, and their saved content.
name field holds the name of the file. The ref field records the hash of the saved file that is used by wrapper functions to retrieve the actual content of the saved file. The commit field is unused.
List *staged_files: The staging area of the repository. Contains the saved content of changed tracked files since the head commit of the repository.
name field holds the name of the file. The ref field records the hash of the saved file that is used by wrapper functions to retrieve the actual content of the saved file. The commit field is unused.
List *branches: The branches that exist in the repository.
name field holds the name of the branch. The commit field points to the head commit of the branch (the latest commit of the branch). The ref field is unused.
Blob *current_branch: The current branch of the repository.
branches.
Commit *head_commit: The head commit of the repository. Also the head commit of the current branch (Since Gitlite does not support detached head). Also the latest commit made on the current branch.
current_branch->commit.
head_commit->tracked_files contains the tracked files and their saved content at the head commit of the repository.
The wrapper functions also read from the current working directory to construct the following variable. This variable is also passed as a parameter to the functions you need to implement. When the functions return, the content of this variable is ignored.
List *cwd_files: The files in the current working directory (CWD).
name field holds the name of the file. The ref field and commit field are unused.
In particular, the wrapper functions use the linked list operations you have implemented in Part 1 to build linked lists. The above linked lists are created using list_new(), and populated using list_put(), such that names (filenames or branch names) are sorted in ascending lexicographic order. You may want to verify the linked list operations before you proceed.
void init(Blob *¤t_branch,
List *&branches,
List *&staged_files,
List *&tracked_files,
Commit *&head_commit);
gitlite initbranches, staged_files, tracked_files.
initial commit. Set the time string and compute the hash. This commit tracks no files (initialize commit->tracked_files as well) and has no parents.
master and set it as the current branch. Add the initial commit to the branch. Set the head commit of the repository as well.
Utils.cpp for some useful functions.
gitlite.cpp.
head_commit to the global list of commits, create a .gitlite directory in the current working directory, and save the newly initialized data structures to this directory..gitlite directory would fail with the message: A Gitlite version-control system already exists in the current directory. This is handled by the wrapper function..gitlite directory. If Gitlite cannot find an initialized .gitlite directory, it would print Not in an initialized Gitlite directory. and terminates immediately. This is handled by the wrapper function.master.
bool add(const string &filename,
List *staged_files,
List *tracked_files,
const Commit *head_commit);
gitlite add [filename]true.
false. (This happens when a modified file was added, then restored to original version, and added again. Since the file was restored, it should be removed from the staging area.)
tracked_files stores the currently tracked files of the repository and their saved content.head_commit->tracked_files contains the tracked files of the repository and their saved content at the head commit of the repository.Utils.cpp for some useful helper functions.list_put() to add files to the list such that the name follows ascending lexicographic order.filename exists in the current directory. Otherwise, the wrapper function would report File does not exist..gitlite directory to save a copy of the file and update the index for staging area. The saved copies are identified by the hash value, such that only one copy is save for different file versions with the same content.
bool commit(const string &message,
Blob *current_branch,
List *staged_files,
List *tracked_files,
Commit *&head_commit);
gitlite commit [message]No changes added to the commit. and return false.
true.
Utils.cpp for some useful helper functions.true, the wrapper function would add the commit pointed by head_commit to the global list of commits, and make filesystem operations to the .gitlite directory to write changes to filesystem.
bool remove(const string &filename,
List* staged_files,
List *tracked_files,
const Commit *head_commit);
gitlite rm [filename]No reason to remove the file. and return false. Otherwise, return true.
restricted_delete(filename) in Utils.cpp for deleting a file from filesystem.true, the wrapper function would make filesystem operations to the .gitlite directory to remove index for staging area.
void log(const Commit *head_commit);
gitlite log
===
commit fd0efd44873699357e2a452ec8f17e79ac965e4c
Date: Thu Jul 22 22:23:56 2021
Another commit message
===
commit 61db0df36b222c7acc910b2dec40e5461e4041cb
Date: Thu Jul 22 22:23:31 2021
A commit message
===
commit 0ed4cfc32f3478a10c755c5c8cca980b14c54f17
Date: Thu Jul 22 22:22:50 2021
initial commit
commit_print(commit) in Commit.cpp for displaying information of a commit.
void status(const Blob *current_branch,
const List *branches,
const List *staged_files,
const List *tracked_files,
const List *cwd_files,
const Commit *head_commit);
gitlite status
=== Branches ===
another-branch
*master
=== Staged Files ===
staged_file.txt
=== Removed Files ===
gone.txt
=== Modifications Not Staged For Commit ===
changed.txt (modified)
file1.txt (modified)
file2.txt (deleted)
file3.txt (modified)
file4.txt (deleted)
missing.txt (deleted)
=== Untracked Files ===
files.untracked
*.
(modified) for case 1 and 2. Append (deleted) for case 3 and 4.
* of current branch does not count.cwd_files stores the files located in the currently working directory. Use this to find out untracked files.get_sha1(filename) returns the hash of the file version in CWD.is_file_exists(filename) checks whether a file exists in CWD.tracked_files to see whether each file matches one of the 4 conditions.list_put(). Filenames and branch names follow ascending lexicographic order if list_put() is implemented correctly.
bool checkout(const string &filename,
Commit *commit);
gitlite checkout -- [filename] or gitlite checkout [commit_id] -- [filename]commit is nullptr, then the wrapper function cannot find the commit with the commit id. Print No commit with that id exists. and return false.
File does not exist in that commit. and return false.
true.write_file(filename, ref) for writing the content referred by ref to the file in filesystem.61db0df36b222c7acc910b2dec40e5461e4041cb can be abbreviated as 61db0df or simply 61, if no other commits with that same prefix exist. The shortest abbreviation has 2 characters.
bool checkout(const string &branch_name,
Blob *¤t_branch,
const List *branches,
List *staged_files,
List *tracked_files,
const List *cwd_files,
Commit *&head_commit);
gitlite checkout [branch_name] A branch with that name does not exist. and return false.
No need to checkout the current branch. and return false.
There is an untracked file in the way; delete it, or add and commit it first. and return false.
true.restricted_delete(filename) for deleting a file from filesystem. write_file(filename, ref) for writing the content referred by ref to the file in filesystem..gitlite directory to update the index for staging area and tracked files.
bool reset(Commit *commit,
Blob *current_branch,
List *staged_files,
List *tracked_files,
const List *cwd_files,
Commit *&head_commit);
gitlite reset [commit_id]commit is nullptr, then the wrapper cannot find the commit with the commit id. Print No commit with that id exists. and return false.
There is an untracked file in the way; delete it, or add and commit it first. and return false.
true.restricted_delete(filename) for deleting a file from filesystem. write_file(filename, ref) for writing the content referred by ref to the file in filesystem.
Blob *branch(const string &branch_name,
List *branches,
Commit *head_commit);
gitlite branch [branch_name]A branch with that name already exists. and return nullptr.branches.master branch.nullptr, the wrapper function would make filesystem operations to the .gitlite directory to write the branch information.
bool remove_branch(const string &branch_name,
Blob *current_branch,
List *branches);
gitlite rm-branch [branch_name]A branch with that name does not exist. and return false.
Cannot remove the current branch. and return false.
true.true, the wrapper function would make filesystem operations to the .gitlite directory to remove information of the branch.
bool merge(const string &branch_name,
Blob *¤t_branch,
List *branches, List *staged_files,
List *tracked_files,
const List *cwd_files,
Commit *&head_commit);
gitlite merge [branch_name]A branch with that name does not exist., and return false.
Cannot merge a branch with itself. and return false.
You have uncommitted changes. and return false.
initial commit --- c1 --- c2 --- c3 --- c4 (head of master)
\
--- n1 --- n2 (head of new)
The split point of the master branch and the new branch is the commit c2.
c4 are c3, c2, c1 and the initial commit.
c4 and n2 are c2, c1 and the initial commit.
c2 is the only latest common ancestor. Hence it is the split point of master and new.
Commit.cpp. This will be graded separately:
Commit *get_lca(Commit *c1, Commit *c2);
For two given commits c1 and c2, return the latest common ancestor.
Given branch is an ancestor of the current branch. and return true.
cwd_files, if there exists a file that is not tracked in the head commit of the current branch but tracked in the head commit of the given branch, print There is an untracked file in the way; delete it, or add and commit it first. and return false.
Current branch fast-forwarded. and return true. If it failed, return false.
cwd_files, if there exists a file that is not tracked in the head commit of the current branch but tracked in the head commit of the given branch, print There is an untracked file in the way; delete it, or add and commit it first. and return false.stage_content(filename) explicitly to modify the index in the .gitlite directory.
stage_content(filename) explicitly to modify the index in the .gitlite directory.
add_conflict_marker(filename, ref) in Utils.cpp)
<<<<<<< HEAD
contents of the file in the current branch
=======
contents of the file in the given branch
>>>>>>>
stage_content(filename) explicitly to modify the index in the .gitlite directory.
Merged [given branch name] into [current branch name]. Added October 15: Use get_merge_commit_message() to compose this message.
Encountered a merge conflict. (once is enough)
true.
true.
tracked_files of these three commits: split point, head commit of the current branch, head commit of the given branch.new into master which results the merge commit m1. The set of commit becomes:
initial commit --- c1 --- c2 --- c3 --- c4 --- m1 (head of master)
\ /
--- n1 ------ n2 (head of new)
The split point (excluding second parents) of the master branch and the new branch is c2. master branch and the new branch is n2. This is the behavior of Git.
add_conflict_marker(),
ref, like this would do: add_conflict_marker(filename, string()).add_conflict_marker() would use an empty line for the current branch version.
Since Gitlite employs the filesystem feature from C++17, it needs to be compiled with C++17 standard. Also, Gitlite needs to be compiled with the two third-party libraries.
A Makefile is provided in the skeleton files to build Gitlite with C++17 standard and the third-party libraries. Makefile and separate compilation were taught during the lab session on September 29. To build Gitlite, simply run make in the directory that contains the .cpp files and Makefile.
make
Gitlite cannot be built using the bundled GCC (version 8.1.0) with our portable version of VSCode for Windows because it has a bug with filesystem. If you are using our portable version of VSCode for Windows, you are required to download the latest portable version of VSCode for Windows here, which is bundled with the bug-fixed GCC (version 10.3.0).
To facilitate grading on ZINC, a different version of Repository.cpp and Utils.cpp, which substitute filesystem operations by memory operations, will be used to compile your ZINC submissions. As a result, your ZINC submissions are compiled with C++11 standard. So make sure you do not include any code that does not conform to C++11 standard.
restricted_delete() will make sure that Gitlite only removes a file if it is in a directory with a .gitlite directory.
After successful compilation, you would find an executable file named gitlite (or gitlite.exe for Windows) in the current directory.
Gitlite process commands by reading the command line arguments argv that are passed to main(). When no command line arguments are given, argc is 1 and Gitlite terminates immediately. You may want to modify the if branch when argc == 1 in main() to test the linked list implementation or something else.
Otherwise, Gitlite parses argv and dispatch the command by calling the wrapper function for the function you have implemented in Part 2.
To run a Gitlite command, enter ./gitlite (or .\gitlite.exe for Windows) followed by the Gitlite command and optional arguments, separated by spaces. For arguments with spaces, wrap them in a pair of quotes:
# For macOS and Linux
$ ./gitlite init
$ ./gitlite commit "a commit message with spaces"
# For Windows
$ .\gitlite.exe init
$ .\gitlite.exe commit "a commit message with spaces"
It is highly recommended to setup a dedicated directory for testing Gitlite.
Create a directory, say, testing:
pa2/
├── testing/
├── Commit.cpp*
├── gitlite.cpp*
├── gitlite
└── ...
Then, change the working directory to testing. If you are using VSCode, right click on testing in the directory structure that appears on the left, click "Open in Integrated Terminal". Run Gitlite, which is located in the parent directory, in the following way:
# For macOS and Linux
$ ../gitlite
# For Windows
> ..\gitlite.exe
Now you can place files in testing and run Gitlite commands to test it:
pa2/
├── testing/
│ ├── .gitlite/
│ ├── file1.txt
│ ├── file2.txt
│ ├── file3.txt
│ └── ...
├── Commit.cpp*
├── gitlite.cpp*
├── gitlite
└── ...
# For macOS and Linux
$ ../gitlite init
$ ../gitlite add file1.txt
$ ../gitlite status
$ ../gitlite commit "some message"
$ ../gitlite log
# For Windows
> ..\gitlite.exe init
> ..\gitlite.exe add file1.txt
> ..\gitlite.exe status
> ..\gitlite.exe commit "some message"
> ..\gitlite.exe log
If you want to reset Gitlite, remove the .gitlite directory and run gitlite init again.
You may want to setup shell alias to ease your testing.
Two commands have been implemented to facilitate debugging and testing:
gitlite find [commit message]
gitlite global-log
Testing by manually manipulating files and typing commands can be dull and error-prone. To better facilitate testing, Gitlite comes with an automated testing module, implemented in Tester.cpp, which does the following:
cout with the expected output specified in the test case file..gitlite directory from the current working directory before running a test case. Always run the automated testing module in a separate, dedicated directory.
In the skeleton files, there is a directory named auto-testing. This is the environment for the automated testing module. It consists of two directories:
src: Stores all files accessible by the automated testing module. Depending on the running test case, files in this directory would be copied to or compared with the ones in auto-testing.
tests: Stores all test cases files (with .in file extension) and optional include files (for example definitions.inc) that are used in test cases.
pa2/
├── auto-testing/
│ ├── src/
│ │ ├── akari.c
│ │ ├── img.png
│ │ ├── text1.txt
│ │ └── text2.txt
│ └── tests/
│ ├── basic-checkout.in
│ ├── basic-operations.in
│ ├── basic-status.in
│ ├── definitions.inc
│ └── init.in
├── Commit.cpp*
├── gitlite.cpp*
├── gitlite
└── ...
To run a test case, make sure you first change the working directory to auto-testing, then run the following command:
$ ../gitlite -t [path]
$ ../gitlite -t tests/init.in
init.in will be run.
.in extension) will be run. For example:
$ ../gitlite -t tests
All test cases (with .in extension) in tests will be run.
A test case passes if all output matches the expected output. You would see the following message:
Running test: tests/init.in
Test PASSED
Otherwise, the test case fails and you would see the output and the expected output. The differences are highlighted in the following example:
Running test: tests/init.in
Wrong output for command: log
Expected:
===
commit ([a-f0-9]+)[ \t]*\n(?:Merge:\s+[0-9a-f]{7}\s+[0-9a-f]{7}[ ]*\n)?Date: \w\w\w \w\w\w ?\d+ \d\d:\d\d:\d\d \d\d\d\d
the initial commit
Actual:
===
commit a22b278a25b176ab6da1c40eb1b26e638e54bcf9
Date: Fri Oct 1 20:07:59 2021
initial commit
Test FAILED
Notes:
a22b... and the date. You would see the regular expression ([a-f0-9]+)... in the expected output. This is normal.
$ ../gitlite -tv [path]
Here are the set of statements avaliable to use in test cases. All paths are interpreted relative to the current working directory
I FILE
Include a file. Replace this statement with the contents of FILE.
+ NAME F
Copy the contents of src/F into a file named NAME.
- NAME
Delete the file named NAME.
> COMMAND ARGUMENTS
LINE1
LINE2
...
<<<
Run Gitlite with COMMAND ARGUMENTS as its parameters.
Compare its output with LINE1, LINE2, etc.,
report error if any discrepancy is found.
The end of output is denoted by the <<< delimiter.
= NAME F
Check whether the file named NAME is identical to src/F, and report an
error if it does not.
* NAME
Check whether the file NAME does not exist, and report an error if it
does exist.
E NAME
Check whether the file or directory named NAME exists, and report an error if it
does not.
D VAR "VALUE"
Define the variable VAR to have the literal value VALUE.
The variable can be used by ${VAR}.
I tests/definitions.inc # include provided definitions in ./tests/definitions.inc
> init # run init command
<<< # no expected output for init command
+ sample.txt sample_src.txt # copy ./src/sample_src.txt to ./sample.txt
> add sample.txt # run add command
<<< # no expected output for add command
> commit "add sample"
<<<
> log # run log command
=== # beginning of expected output of log command
${COMMIT_HEAD} # ${COMMIT_HEAD} will be replaced by the value of variable COMMIT_HEAD
add sample
===
${COMMIT_HEAD}
initial commit
<<< # end of expected output of log command
definitions.inc, which contains some useful variable definitions of regular expressions:
DATE matches time strings in commit log. For example,
Date: Wed Oct 16 23:26:06 2021
COMMIT_HEAD matches commit log headers, including the date string. For example:
commit cf992c00717703ae7c23e799c01e0c3467b2305e
Date: Wed Oct 6 23:26:06 2021
The commit id (cf992c00...) can be captured and reused as variables. ${1} would be defined as the first commit id that appears in the output, ${2} as the second commit id, and so on. This is useful for testing commands that requires commit ids. The results are available until the next command (which would discard all previous ${n}). See below for an example.
ARBLINE matches an arbitrary line (may be empty).
ARBLINES matches any number of arbitrary lines (including no lines).
===
commit cf992c00717703ae7c23e799c01e0c3467b2305e
Date: Wed Oct 6 23:26:06 2021
add sample
===
commit 3ee35c73c836a209ee0c4e3f4655431016a9163f
Date: Wed Oct 6 23:24:58 2021
initial commit
... # previous contents omitted
> log
===
${COMMIT_HEAD} # first commit id in the output, saved to ${1}
add sample
===
${COMMIT_HEAD} # second commit id in the output, saved to ${2}
initial commit
<<<
D c2 "${1}" # c2 is the commit id of the commit "add sample"
D c1 "${2}" # c1 is the commit id of the initial commit
> checkout ${c2} -- sample.txt # checkout sample.txt from the initial commit
<<<
> status
=== Branches ===
\*master # escape '*'
=== Staged Files ===
=== Removed Files ===
=== Modifications Not Staged For Commit ===
something_modified.txt \(modified\) # escape '(' and ')'
=== Untracked Files ===
${ARBLINES}
<<<
*, ( and ), by adding \ before them, since they are reserved characters in regular expressions.
Deadline: 22 October 2021 Friday HKT 23:59.
You may earn 8% course grade for each PA via Automated Grading on the ZINC Online Submission System. Please check here for a usage overview of ZINC. Compress Commit.cpp and gitlite.cpp into a zip file (the name of the zip file does not matter) for submission to ZINC.
Updated October 3: Test cases for linked list operations are available in ZINC. See Test Cases for details. The test cases will be run when you submit on ZINC. 1 point is given for each test case. The current scores do not reflect what you will get. A different but similar set of test cases will be used in final grading.
Updated October 8: Test cases for Part 2 are available in ZINC. Some test cases (including those for Part 1) also check for memory leak. See Test Cases for details.
To be released.
Before the program terminates, you need to ensure you have deallocated all dynamic memory you have allocated in the entire execution of the program.
Memory leak is checked by using the sanitizer, invoked by the -fsanitize=address,leak,undefined option (See related documentation here). Some revealed test cases on ZINC also check for memory leak (See Test Cases).
Although some test cases on ZINC already check for memory leak, you may still want to check for memory leak yourself. However, the above option does not work in Windows minGW g++. Instead, you can do so by remote connecting using SSH to the machines in the Linux Lab (Lab 2). The following shows you how to do so by using the Virtual Barn. You may want to adjust the steps accordingly if you are familiar with using SSH on your own computer.
PuTTY. There is a shortcut on your Desktop. Enter a lab 2 machine (csl2wkXX.cse.ust.hk, where XX=[01 ... 53], for example csl2wk13.cse.ust.hk) as the Host Name. Note: If that server doesn't work, you may try other servers by changing XX.
pwd (Print Working Directory) to verify this. It should show /homes/your_login
FileZilla from Start Menu. Click "File" -> "Site Manager" -> "New Site". Choose SFTP for protocol according to the screenshot below. Enter csl2wk13.cse.ust.hk (or the server you used in the previous step) as Host. Choose "Ask for password" for the Logon Type.
/homes/your_login. Also, you need to use this Makefile in the machines in Linux Lab. This Makefile includes the sanitizer option on compilation. You can close FileZilla after doing so.
PuTTY, and compile your program using the make command:
make
./gitlite
Grading scheme on memory leak penalty will be released later.
Here are some test cases that are available on ZINC.
The following test cases test implementations of linked list operations.
list_delete() is not called for the following test cases.
ListNew: Calls list_new() to create a list. The data fields in the sentinel are checked.
ListPushBackOne: Calls list_push_back() to add one blob at the back. Both forward and backward orders of the linked list are checked.
ListPushBackMultiple: Calls list_push_back() to add multiple blobs at the back. Both forward and backward orders of the linked list are checked.
ListFindNameFound: Calls list_find_name() to find a blob with the given name. The blob can be found. The return values are checked.
ListFindNameNotFound: Calls list_find_name() to find a blob with the given name. The blob cannot be found. The return values are checked.
ListPutNewOne: Calls list_put() to add a blob to the list. The return value is checked. Both forward and backward orders of the linked list are checked.
ListPutNewMultiple: Calls list_put() to add multiple blobs to the list. The return values are checked. Both forward and backward orders of the linked list are checked.
ListPutNewMultipleCommit: Same as ListPutNewMultiple but uses commits instead.
ListPutUpdateOne: Calls list_put() to update a blob in the list. The return value is checked. Both forward and backward orders of the linked list are checked.
ListPutUpdateMultiple: Calls list_put() to update multiple blobs to the list. The return values are checked. Both forward and backward orders of the linked list are checked.
ListPutUpdateMultipleCommit: Same as ListPutUpdateMultiple but uses commits instead.
ListRemoveNotFound: Calls list_remove() to remove a blob with the given name. The blob cannot be found. The return values are checked. Both forward and backward orders of the linked list are checked.
ListRemoveOne: Calls list_remove() to remove a blob with the given name. The blob can be found and removed. The return value is checked. Both forward and backward orders of the linked list are checked.
ListRemoveAll: Calls list_remove() to remove all blobs from the list. The return values are checked. Both forward and backward orders of the linked list are checked.
ListSize: Calls list_size() to get the size of several lists.
ListClearEmpty: Calls list_clear() on an empty list. Both forward and backward orders of the linked list are checked.
ListClearNonEmpty: Calls list_clear() on an non-empty list. Both forward and backward orders of the linked list are checked.
ListReplaceEmptyByEmpty: Calls list_replace() to replace an empty list by an empty list. Both forward and backward orders of the linked list are checked.
ListReplaceEmptyByNonEmpty: Calls list_replace() to replace an empty list by an non-empty list. Both forward and backward orders of the linked list are checked.
ListReplaceNonEmptyByEmpty: Calls list_replace() to replace an non-empty list by an empty list. Both forward and backward orders of the linked list are checked.
ListReplaceNonEmptyByNonEmpty: Calls list_replace() to replace an non-empty list by an non-empty list. Both forward and backward orders of the linked list are checked.
ListCopyEmpty: Calls list_copy() to copy an empty list. Both forward and backward orders of the linked list are checked.
ListCopyNonEmpty: Calls list_copy() to copy an non-empty list. Both forward and backward orders of the linked list are checked.
list_delete() is called for every list created in each test case.
list_delete() is called for every list created in each test case. Memory leak is also checked for the following test cases (See Memory Leak).
ListNewListPushBackOneListPushBackMultipleListPutNewOneListPutNewMultipleListPutNewMultipleCommitListPutUpdateOneListPutUpdateMultipleListPutUpdateMultipleCommitListRemoveOneListRemoveAllListClearEmptyListClearNonEmptyListReplaceEmptyByEmptyListReplaceEmptyByNonEmptyListReplaceNonEmptyByEmptyListReplaceNonEmptyByNonEmptyListCopyEmptyListCopyNonEmptyget_lca() function. Memory leak is also checked for the following test cases (See Memory Leak).
Ahead: One commit is ahead of another commit.Diverged: The two commits have diverged.init.in: Tests the following commands:
basic-operations.in: Tests the following commands:
basic-status.in: Tests the following commands:
basic-checkout.in: Tests the following commands:
merge-simple.in: Tests step 1-6 of the merge command.
merge-case123.in: Tests step 7 cases 1-3 and the following steps of the merge command.
Nodes in a circular doubly linked list have prev and next pointers to the previous and next node of the linked list. Also, the head and the tail of the linked list are connected such that prev of the head points to the tail, and next of the tail points to the head.
A sentinel node is a special dummy node in the linked list, such that the head of the linked list is always not a nullptr. This simplifies linked list operations. For Blob sentinel nodes, you should set name and ref to empty strings, and commit to nullptr.
list_new(). This linked list only has the sentinel node (shown in grey): Blob objects by list_push_back(), in the order of file1.txt, file2.txt, file3.txt. The sentinel node is not removed after inserting other nodes:
Here is the directory structure of .gitlite
.gitlite/
├── HEAD
├── STAGE
├── TREE
├── blobs/
│ ├── 886f26a88c0101a50eacab8f1e46284b7e393165
│ ├── bce8ce9ce1a5b01048bee798b6aafb1fbbb082bb
│ └── c94d06a50f019d9179ae8fc7c05874a4151f0dc6
├── commits/
│ ├── 23/
│ │ └── 2399af262814396b707396d7e15544a8c923758e
│ ├── 26/
│ │ └── 26b23b442c918e619c76cca053d0c5a07049d1e7
│ ├── 56/
│ │ └── 56719c4a6cc3e593f619750cb874354029c40347
│ ├── 5b/
│ │ └── 5b71d89dc20868d0a10992bd54aad6995a9ed21f
│ ├── 64/
│ │ └── 6402d3807e872b7ac020fed236da57890ac3e38f
│ ├── 79/
│ │ └── 79133e343b15efb900842de2278e3eaf9f09b619
│ ├── 7a/
│ │ └── 7abba4dee6325ab0e01ef92c0c1c776f1ecff2b1
│ └── ff/
│ └── ff6286141de2e6bf58c6cf314d3f8dffa7ddab96
├── index/
└── refs/
├── master
└── new
HEAD: stores the name of the current branch.
STAGE: stores the names and SHA-1 values of the staged files. This does not exist in real Git.
TREE: stores the names and SHA-1 values of the tracked files. This name borrows from real Git, that allows tracking files in sub-directories. So the tracked files in real Git form a tree.
blobs/: a directory storing all archived files. The filenames are the SHA-1 values of the files.
commits/: a directory storing commit information. It contains some sub-directories used as a simple hash table (which you will learn later in this course). Commits go into these sub-directories according to the prefix of their SHA-1 values. And the commits themselves are also named after the SHA-1 values.
index/: a directory storing all staged files with the filenames and content when they were staged.
refs/: a directory storing the branch information. Each branch is associated with a file with the branch name as filename. The content is the commit id of the head commit of that branch.
The design of the tasks and the automated tester is inspired by and adapted from Gitlet, a course project from CS 61B from Berkeley.
This assignment uses two third-party libraries: Cereal for serialization, and TinySHA1 for SHA-1 computation. The licenses are included under include/.
This assignment was originally proposed and designed by Benran HU.