Posts

The Final Report

This post describes my GSoC project, along with a short account of how I started contributing to Git. My GSoC '20 project was to convert 'submodule' to a builtin by porting it from shell to C. Initially, Git commands were written in shell, with some Perl as well. As time progressed, Git came to run on more platforms and projects grew large (spanning millions of lines of code), which exposed problems in production-level code, such as:

Difficulties in portability of code. The submodule shell script uses commands such as echo, grep, cd, test and printf, to name a few. On non-POSIX-compliant systems, one has to use emulation layers to provide such commands, which is a lot of extra work.

There is a large overhead involved in calling the command. As these commands implemented in shell script are not builtins, they tend to call multiple fork() and exec() syscalls for creating more child p

GSoC Week 16

One last time

The final evaluations start on August 24, i.e., in 2 days. GSoC was a fantastic rollercoaster ride. I learned so much about Git, C, shell scripting, Linux and programming in general. It was the best learning experience I have had to date. There were many highs and lows throughout this whole journey, and thankfully my mentors, Christian Couder and Kaartic Sivaraam, helped me get through it all. This is the final edition of my GSoC blog series, since next week I have to write a post summarising my GSoC project and the experience. Therefore, for one last time, I will give an account of the work I have done in the past week. This week went fine in terms of GSoC. The summary patch series will graduate to master, as stated in the “What’s cooking” mail dated August 22 (IST). The t7401 patch series also looks good, and its v3 will also get into the pipeline (currently in seen). I am happy that I have been able to contribute a significant amount to Git d

GSoC Week 15

Feedback on the patches

I finally submitted the patches about ten days back (yes, I missed the previous blog because I kind of forgot about it). The patches got good feedback but required some changes to be good to go.

On t7401: modernise cleanup and warn

The patch was well received, though the commit (t7401: modernise style, 2020-07-23) needed some changes. It was better to redirect the output of git rev-parse to a file and then trim it using cut -c1-7, instead of using the pipe | operator. Also, Taylor Blau and Junio C Hamano suggested I drop the commit (t7401: ensure uniformity in the '--for-status' test, 2020-07-10), since the change it made could be combined with the commit (t7401: change test_i18ncmp syntax for clarity, 2020-07-10). I also introduced a commit (t7401: change indentation for enhanced readability, 2020-08-11), which improves the indentation of the tests in the script for enhanced readability. Also, the commit (t7401: add

GSoC Week 13

Passed the Second Eval!

The results for the second evaluation arrived at dawn on August 1, and I passed. It felt great to get positive feedback from Christian and Kaartic regarding my performance. Also, I think no one saw my patch regarding t7401, so I will most probably send it to the list once again. Finally, we are done with the v2 of git submodule summary. It was a big rollercoaster, with so many bugs and new use cases coming up along the way. The last problem that came up may appear very trivial to some, but it explains another difference between the shell and the C version.

The Problem

The problem was that when the shell version of summary was run outside of the test suite, it printed two newlines at the end of the output (though it printed only one newline when run inside the tests), whereas the C version printed just one newline irrespective of where it was run. The explanation for inconsistency between the test and non-test ou

GSoC Week 12

Submitting the patch

This week I finally submitted a patch to the List, though it isn’t the only one I plan on sending. In the last blog, I talked about the git log issue, which is thankfully fixed! Some other problems came up, but they are sorted now. The work on summary also revealed some issues in the test script t7401, which prompted me to create an entirely new test script. I also learnt about regular expressions, so that is another thing I will touch on in this blog.

The issue with t7401

The test t7401-submodule-summary.sh was written a long time ago, when the command git submodule add did not exist. Hence the submodules are added using git add in the test script. This leads to some unexpected behaviour when trying to run commands like git submodule init and git submodule deinit. I came across this issue while trying to add a test to verify the summary output of a deinitialised submodule. The test I

GSoC Week 11

A quick week

This week flew by very quickly. It feels as if I wrote the last blog a couple of days back. There is good progress compared to last week, and all the issues listed in the previous blog are sorted! I did not realise this until I started writing this blog. Feels good. Now the only thing left to sort out is one small issue which went unnoticed for so long but finally came to light after some investigation by Kaartic on his lxconf repository. I also need to add a couple of tests in t7401 related to the aforementioned issue(s).

The issues

Initially, there were two issues. But after rebasing my commits related to solutions of the other problems, one got solved while the other persisted. I will talk about both of them, starting with the solved one.

Printing of some extra things in the summary of a deinitialised SM

When I was trying to find the summary of a deinitialised submodule (SM), I should not have got any output, since the SM is deinitialised. To my surpri

GSoC Week 10

Addressing comments on my patch

I spent this week addressing the suggestions made by Johannes, which were extremely detailed and long. I am done with most of the suggestions, so I will highlight some important ones and what I learned from them. This week, I saw the ocean for, I guess, the first time in my life, and it was a totally unique experience!

(Image: Peaceful Seas)

Moving to the comments made by Johannes.

Eliminate find_unique_abbrev()

The shell version of summary calls git rev-parse twice, for two jobs, which are: checking for a missing_src and a missing_dst, i.e., the missing source and destination committish.

    test $mod_src = 160000 &&
    ! GIT_DIR="$name/.git" git rev-parse -q --verify $sha1_src^0 >/dev/null &&
    missing_src=t

    test $mod_dst = 160000 &&
    ! GIT_DIR="$name/.git" git rev-parse -q --verify $sha1_dst^0 >/dev/null &&
    missing_dst=t

Quoting my mentor Kaartic: The above code assigns missi