GSoC Week 11

A quick week

This week flew by very quick. It feels as if I wrote this blog a couple of days back. There is good progress as compared to last week and all issues listed in the previous blog are sorted! I did not realise this until I started writing this blog. Feels good.

Now only one thing is left to sort out is one small issue which went unnoticed for so long but finally come to light after some investigation from Kaartic on his lxconf repository. I also need to add a couple of tests in t7401 which are related to the aforementioned issue(s).

The issues

Initially, there were two issues. But after rebasing my commits related to solutions of the other problems, one got solved but the other persisted. I will talk about both of them, starting with the solved one.

Printing of some extra things in the summary of a deinitialised SM

When I was trying to find the summary of a deinitialised SM, I should not have got any output since the SM is deinitialised. To my surprise, I still got an output

 # Predicted output (output from the shell version)
 $ git submodule summary 07a20f569b4b1690e717eaac0954007a8edfbfc2
 --nothing--
 
 # Actual output (output from the C version)
 $ git submodule summary 07a20f569b4b1690e717eaac0954007a8edfbfc2
 fatal: not a git repository: '.git'  
 fatal: not a git repository: '.git  
* sha1collisiondetection 1603399...855827c:  
Warn: sha1collisiondetection doesn't contain commits 16033998da4b273aebd92c84b1e1b12e4aaf7009 and 855827c583bc30645ba427885caa40c5b81764d2

As you may see, that’s a lot of damage. On top of this, the tests and the build were passing! The thing is that there should not be an output for directories which are not a GIT_DIR. In the shell version this was happening:

GIT_DIR="$sm_path/.git" git rev-parse --git-dir >/dev/null 2>&1 &&
printf '%s\n' "$sm_path"

The rev-parse documentation states this about the --git-dir option:

--git-dir
Show $GIT_DIR if defined. Otherwise show the path to the .git directory. The path shown, when relative, is relative to the
current working directory.

If $GIT_DIR is not defined and the current directory is not detected
to lie in a Git repository or work tree print a message to stderr and
exit with nonzero status.

Therefore we actually had to check for the GIT_DIR and not the activity of the SM like I was actually doing. The checking for GIT_DIR is a subset of the is_submodule_active() check. Therefore, I had to use the is_nonbare_repository_dir() function to make sure things go fine and the check is accurate. Rebasing this commit onto my main branch for the v2 fixed this issue. This was because of the use of is_nonbare_repository_dir() because for deinitialised SMs, reading the gitfile is not possible.

Printing of a git log error prompt for non-existent SMs

It is good that the previous issue got solved but this issue is even more annoying. For SMs which do not exist in the working directory of the superproject, if we try to find a summary of them, we except a “not a repository” warning and max to max a couple of warnings here and there. But in fact we get this output:

# Predicted output (output from the shell version)
$ git submodule summary 4efb51bdef8916013b8490ef087a1c3712b907c7 -- lxconf-bash  
fatal: not a git repository: 'lxconf-bash/.git'  
* lxconf-bash 2c819db...0000000:

# Actual output (output from the C version)
$ git submodule summary 4efb51bdef8916013b8490ef087a1c3712b907c7 -- lxconf-bash  
fatal: exec 'rev-parse': cd to 'lxconf-bash' failed: No such file or directory  
* lxconf-bash 2c819db...0000000:  
fatal: exec 'log': cd to 'lxconf-bash' failed: No such file or directory

The rev-parse error is appropriate since this is what happens after we learn that the SM does not exist. But the log error was not really expected. According to my mentor Kaartic, it may be because of some lack of check even after we know that the SM does not exist. This will require some further investigation but this is what I have found till now is that this is happening because of the control flow passing into this part of prepare_submodule_summary():

if  (p->status  ==  'D'  ||  p->status  ==  'T')  {
	generate_submodule_summary(info,  p);
	continue;
}

which is justified since we enter this case when p->status may be D i.e., the deletion of something, in this case a SM. But the main thing happens after that since I think we should not allow git log to run in case of an absence of a GIT_DIR. I must put some checks there.

The solved issues

The 3 issues at the end of last week’s article are solved (1 was solved a that time). To recall, the three issues were:

  1. Improve the for-status block since it had some repetitions.

  2. Use is_submodule_active() at the end of prepare_submodule_summary() instead of spawning a child process.

  3. Compute the error messages in generate_submodule_summary() and pass them to print_submodule_summary() as a char *.

In addition, other issues which came up this week apart from those mentioned in the two subheadings before are:

  1. Use get_oid() in module_summary() instead of spawning a new child process. We initially did this to find and verify the hash of the HEAD of a submodule

     cp_rev.git_cmd = 1;
     argv_array_pushl(&cp_rev.args, "rev-parse", "-q",
     	"--verify", argc ? argv[0] : "HEAD", NULL);
    
     if (!capture_command(&cp_rev, &sb, 0)) {
     	strbuf_strip_suffix(&sb, "\n");
     	if (argc) {
     		argv++;
     		argc--;
     	}
    

    but instead we could have used the get_oid() function here and pass the resultant OID directly into compute_summary_module_list() instead of passing it as a string like we were doing since. Therefore the fix is something like:

     if (!get_oid(argc ? argv[0] : "HEAD", &head_oid)) {
     		if (argc) {
     			argv++;
     			argc--;
     		}
    
  2. Implement Stefan Beller’s suggestion of using head_ref_submodule() in generate_submodule_summary(). In generate_submodule_summary(), we could have not spawned a child process to find the HEAD ref of a submodule. The thing is that head_ref_submodule() has been deprecated therefore we needed newer alternatives. This quote from Kaartic sums it all up:

    A quick search reveals that ‘head_ref_submodule’ existed during that
    period. On further investigation it seems that ‘refs_head_ref’ was
    introduced in
    62f0b39
    (refs: add refs_head_ref(), 2017-08-23) and ‘head_ref_submodule’ was
    made to use it. Later on, in
    419221c
    (refs: remove dead for_each_*submodule(), 2017-08-23),
    ‘head_ref-submodule’ was removed with an advice to use the 'refs
    ’ API
    for accessing submodules.

    Therefore we actually needed to use the refs_head_ref() function for this purpose. The result you may find here.

  3. Use index_fd() instead of spawning a process to call git hash-object in generate_submodule_summary(). Before, we were spawning a whole child process to calculate the hash-object and thus create a blob from a file (in our case the SM). A smaller and more efficient solution can be to use index_fd() along with fstat() to do the same task. This is an issue I have solved as of now with the result being available here. It needs some minor changes like precise error messages and indentation fixes now.

Next steps

Now I need to correct the git log error and hopefully things will be fixed. I also need to write a couple of tests for the same. Let’s get this wrapped up now!

Over and out,
Shourya Shukla

Comments

Popular posts from this blog

The Final Report

GSoC Week 4 [One month special]

GSoC Week 10