Posts

Showing posts from 2020

The Final Report

This blog will describe my GSoC project along with a small account of how I started contributing to Git. My GSoC '20 project was to convert 'submodule' to a builtin by porting it from shell to C. Initially, Git commands were written in shell, with some instances of Perl as well. As time progressed, Git came to run on more platforms and projects became large (spanning millions of lines of code), which exposed problems in production-level code, such as: Difficulties in portability of code. The submodule shell script uses commands such as echo, grep, cd, test and printf, to name a few. On non-POSIX-compliant systems, one has to use emulation layers to provide such commands, which is a lot of extra work. There is a large overhead involved in calling the command. As these commands implemented in shell script are not builtins, they tend to call multiple fork() and exec() syscalls for creating more child p

GSoC Week 16

One last time The final evaluations start on August 24, i.e., in 2 days. GSoC was a fantastic rollercoaster ride. I learned so much about Git, C, shell scripting, Linux and programming in general. It was the best learning experience I have had to date. There were many highs and lows throughout this whole journey and, thankfully, my mentors, Christian Couder and Kaartic Sivaraam, helped me get through it all. This is the final edition of my GSoC blog series, since next week I have to write a blog summarising my GSoC project and the experience. Therefore, for one last time, I will give an account of the work I have done in the past week. This week went fine in terms of GSoC. The summary patch series will graduate to master as stated in the “What’s cooking” mail dated August 22 (IST). The t7401 patch series also looks good, and its v3 will also get into the pipeline (currently in seen). I am happy that I have been able to contribute a significant amount to Git d

GSoC Week 15

Feedback on the patches Finally, I submitted the patches about ten days back (yes, I missed the previous blog because I kind of forgot about it). The patches got good feedback and required some changes to be good to go. On t7401: modernise cleanup and warn The patch was well received, though it needed some changes in the commit (t7401: modernise style, 2020-07-23). It was better to redirect the output of the rev-parse to a file and then cut the output using cut -c1-7 instead of using the pipe | operator. Also, Taylor Blau and Junio C Hamano suggested I drop the commit (t7401: ensure uniformity in the '--for-status' test, 2020-07-10) since the change it made could be combined with the commit (t7401: change test_i18ncmp syntax for clarity, 2020-07-10). I also introduced a commit (t7401: change indentation for enhanced readability, 2020-08-11) which improves the indentation of the tests in the script for enhanced readability*. Also, the commit (t7401: add
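The suggestion above, writing the rev-parse output to a file and only then cutting it, can be sketched as follows. This is a minimal illustration and not the actual t7401 code: print_hash is a hypothetical stand-in for the real git rev-parse call, and the hash it prints is made up. A likely reason for the suggestion is that a pipeline reports only the exit status of its last command, so a failing git command on the left of the pipe could go unnoticed.

```shell
# print_hash is a hypothetical stand-in for something like
# `git rev-parse --verify HEAD`; the hash below is made up.
print_hash() {
    printf '%s\n' 1234567890abcdef1234567890abcdef12345678
}

# short_hash FILE: write the full hash to FILE, then print its first
# seven characters. Because the two steps are chained with &&, a
# failure of the first command stops the chain instead of being
# hidden behind the exit status of cut, as it would be in
# `print_hash | cut -c1-7`.
short_hash() {
    print_hash >"$1" &&
    cut -c1-7 "$1"
}
```

Calling short_hash with a scratch file name prints the abbreviated hash and leaves the full hash in that file, which also makes it easier to inspect when a test fails.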

GSoC Week 13

Passed the Second Eval! The results for the second evaluation arrived at dawn on August 1, and I passed. It felt great to get positive feedback from Christian and Kaartic regarding my performance. Also, I think no one saw my patch regarding t7401, so I will most probably send it to the list once again. Finally, we are done with the v2 of git submodule summary. It was a big rollercoaster, with so many bugs and new use cases coming along the way. The one last problem which came in the way may appear to some as very trivial but explains another difference between the shell and the C version. The Problem The problem was that when the shell version of summary was run outside of the test suite, it resulted in two newlines being printed at the end of the output (though it printed only one newline when run inside the tests) whereas the C version printed just one newline irrespective of where it was run. The explanation for the inconsistency between the test and non-test ou

GSoC Week 12

Submitting the patch This week I finally submitted a patch to the List, though it isn’t the only one I plan on sending to the List. In the last blog, I talked about the git log issue, which is fixed thankfully! There were some other problems which came up, but they are sorted now. The work on summary did reveal some issues in the test script t7401 as well, something which prompted me to create an entirely new test script. I also learnt about regular expressions, so that is another thing I will touch on in this blog. The issue with t7401 The test t7401-submodule-summary.sh was written a long time ago, and the command git submodule add did not exist back then. Hence the submodules are added using git add in the test script. This leads to some unexpected behaviour when trying to run commands like git submodule init and git submodule deinit. I came across this issue while trying to add a test to verify the summary output of a deinitialised submodule. The test I

GSoC Week 11

A quick week This week flew by very quickly. It feels as if I wrote this blog a couple of days back. There is good progress as compared to last week and all issues listed in the previous blog are sorted! I did not realise this until I started writing this blog. Feels good. Now the only thing left to sort out is one small issue which went unnoticed for so long but finally came to light after some investigation from Kaartic on his lxconf repository. I also need to add a couple of tests in t7401 which are related to the aforementioned issue(s). The issues Initially, there were two issues. But after rebasing my commits related to solutions of the other problems, one got solved but the other persisted. I will talk about both of them, starting with the solved one. Printing of some extra things in the summary of a deinitialised SM When I was trying to find the summary of a deinitialised SM, I should not have got any output since the SM is deinitialised. To my surpri

GSoC Week 10

Addressing comments on my patch I spent this week addressing the suggestions made by Johannes, which were extremely detailed and long. I am done with most of the suggestions, so I will highlight some important ones and what I learned from them. This week, I saw the ocean for, I guess, the first time in my life and it was a totally unique experience! Peaceful Seas Moving to the comments made by Johannes. Eliminate find_unique_abbrev() The shell version of summary calls git rev-parse twice for two jobs which are: Checking for a missing_src and a missing_dst, i.e., the missing source and destination committish. test $mod_src = 160000 && ! GIT_DIR="$name/.git" git rev-parse -q --verify $sha1_src^0 >/dev/null && missing_src=t test $mod_dst = 160000 && ! GIT_DIR="$name/.git" git rev-parse -q --verify $sha1_dst^0 >/dev/null && missing_dst=t Quoting my mentor Kaartic: The above code assigns missi

GSoC Week 9

First evaluation complete! Last night, the results of the first evaluation came in and I was so happy to see a “Congratulations! You have passed the first evaluation”. It felt really good to see it! A huge thanks to Christian and Kaartic for believing in me :) First evaluation result This does call for a small celebration! After all, this is the first time I have earned money from my work! A salary I guess ;) After quite some debugging, summary finally worked. The problem was a bit more skewed than it seemed and was far more serious than just for-status or prepare_submodule_summary(). I will explain what happened and how we waded through to fix things. The Problem As I said above, the problem was far more serious than just an option or a function. I will touch on t7418 to make you familiar with what is going on. The first test in t7418 creates a Git repository in a directory called upstream and a submodule called… well submodule. The submodule is added

GSoC Week 8

A busy week This was quite a busy week since I had to make sure the build is perfect as well as give ample time to my research, since I need to meet some deadlines. I have been working to make sure the build is a success. In the later parts of the week I have been debugging the functionality of the --for-status option because of the failure of test 6 of t7418-submodule-sparse-gitmodules.sh. Also, to my delight, my set-branch port is finally merged into master of git/git! What is the test about? The test t7418 mainly involves finding summaries of submodules in case the .gitmodules is missing. Such a scenario may be created by doing a sparse checkout of a repository in such a way that the .gitmodules is not listed in the sparse checkout’s cone. I learned about sparse checkout from this article by Derrick Stolee, one of the GSoC mentors and a Git contributor. I will not dive much into sparse checkout since it is out of scope of this article, but

GSoC Week 7

Taking decisions This week was an even deeper dive into git submodule's code. I am almost done with the module_summary frontend function. My set-branch port will hopefully move to next on git/git! I learned even more about shell scripting and the shell code of summary. This time, I will present my thoughts here on what I have learned. There may seem to be a bit less information here because of the relatively early write-up of the blog, because I finally leave for my home this Friday! <3 Anyway, moving on to the important stuff Current Progress As of now, the module_summary() (the frontend function for the summary subcommand) is almost done. I have successfully ported it from shell, taking some inspiration from Prathamesh’s patch as well. Prathamesh’s patch had 6 functions, namely module_summary(), compute_summary_module_list(), submodule_summary_callback(), prepare_submodule_summary(), print_submodule_summary() and finally verify_subm

GSoC Week 6

A hot and humid week This week I have been trying to get deeper into the subcommand summary and its nuances. I aim to start coding the rest of the functions in the coming days. I am done with most of the frontend module_summary() function. Of course, more edits may follow later, but I want to focus on a bare working version as of now. Current Progress This week I have been studying Parameter Substitution and Parameter Expansion and the code of git submodule summary. I will talk about what I have learnt from these two concepts in this blog and where they are being used, as well as about the commit verification part of summary, which gives us a starting point to proceed from. I gave a teaser of Parameter Substitution in my Week 2 blog. Parameter Substitution Parameter Substitution in Shell scripting helps us to substitute the value of a parameter/variable with another value in a minimal number of steps. The equivalent of this in C would be a cond
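To make the idea of Parameter Substitution concrete, here is a small POSIX shell sketch of two common forms. The function names and values are hypothetical and chosen only for illustration; they are not taken from git-submodule.sh.

```shell
# default_branch NAME: print NAME, or "master" when NAME is unset or empty.
# ${1:-master} expands to the fallback word without modifying $1.
default_branch() {
    printf '%s\n' "${1:-master}"
}

# cached_flag VALUE: print "--cached" only when VALUE is set and non-empty.
# ${1:+--cached} expands to the alternate word in that case and to
# nothing otherwise, so an empty line is printed for an empty VALUE.
cached_flag() {
    printf '%s\n' "${1:+--cached}"
}
```

For example, default_branch "" prints master while default_branch topic prints topic, and cached_flag 1 prints --cached. Each expansion does in one step what would otherwise need an explicit if/else on the variable.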

GSoC Week 4 [One month special]

One month complete! A month has passed now since GSoC started (it will have on June 4) and it is a fine milestone I think! I feel happy about it. This week was more about studying things and what is under the hood rather than programming something, as it has always been. I was stuck with the callback mechanism of the submodule code. I could not understand the purpose behind it and thought that it was not needed. But as always, the things which confuse me are the ones with very simple and crystal clear logic behind them (but again, not to fret, we learn to walk by falling over and over, right?); I will cover the callback mechanism in the blog so that anyone who ever wants to work on submodules, or a future student like me, has some reference to what is going on. What are callbacks? A callback is a function or a segment of a program that you call when a particular trigger is encountered (such as an if statement or the end of a function). And by “you call”, I