Thousands of Forgotten TODOs in Kubernetes Code


Photo by Yancy Min at Unsplash

Kubernetes is a big project, not only because it is in very high demand but also in terms of its source code. At the time of writing, the GitHub repository had more than 86,000 commits, more than 2,000 contributors, more than 2,000 open issues, more than 1,000 open pull requests, and 62,800 stars.

The scc utility counted more than 4.3 million lines of Go (more than 5.2 million lines in total), of which more than 3 million are actual code and more than 700 thousand are comments, spread across more than 16,000 files, including the vendor/ directory.

We recently developed a tool that processes TODO comments in a codebase, to help maintain large projects like this one.

We decided to run our little parser on the Kubernetes sources and see what happens. Here are some results.

tickgit processed the source code at commit 9bf52c2. The output, in CSV format, was then imported into SQLite for querying. Note that the tool finds TODOs only in the checked-out tree. It does not take into account comments that were added and later deleted. The numbers therefore reflect only the TODOs that were still "alive" in the code at this commit.
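The CSV-to-SQLite step can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used for the article: the column names (path, author, added, comment) and the sample rows are hypothetical, and the real tickgit CSV layout may differ.

```python
import csv
import io
import sqlite3

# Hypothetical CSV layout and rows; the real tickgit output may use other columns.
SAMPLE_CSV = """path,author,added,comment
pkg/kubelet/kubelet.go,alice,2016-03-01,TODO: handle restart
pkg/kubelet/kubelet.go,bob,2018-07-15,TODO(bob): remove shim
cluster/gce/util.sh,alice,2015-11-20,TODO: quote variables
"""


def load_todos(csv_text):
    """Load TODO rows from CSV text into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE todos (path TEXT, author TEXT, added TEXT, comment TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Each CSV row is a dict, so named placeholders map directly to columns.
    conn.executemany(
        "INSERT INTO todos VALUES (:path, :author, :added, :comment)", rows
    )
    return conn


def todos_per_file(conn):
    """Return (path, count) pairs, files with the most TODOs first."""
    return conn.execute(
        "SELECT path, COUNT(*) AS n FROM todos "
        "GROUP BY path ORDER BY n DESC, path"
    ).fetchall()


conn = load_todos(SAMPLE_CSV)
per_file = todos_per_file(conn)
print(per_file)
```

Grouping by author or by the year of the added date instead of by path would give the other kinds of rankings with the same pattern.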

Totals (as of 9bf52c2)


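  • 2,380 TODOs in 1,230 files from 363 authors
  • 460 TODOs have an assignee, in the form // TODO (patrickdevivo) Fix the ...
  • 489 TODOs were added in 2019
  • Average TODO age: about 860 days (about 2.3 years)
  • Oldest TODO: 6 […] 2014 («[…]»)
  • Newest TODO: 9 […] 2019 ([…])
  • File with the most TODOs: cluster/gce/util.sh, with 33
  • Author with the most TODOs according to git blame: deads2k, with 147
  • Most surviving TODOs added in a single commit: 64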
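The assignee form mentioned in the list can be detected with a regular expression. The sketch below is an assumption about how such a check might look, not a description of how tickgit actually parses comments; the pattern and the function name assignee are made up for illustration.

```python
import re

# Matches "TODO(name)" or "TODO (name)" and captures the name in parentheses.
ASSIGNED_TODO = re.compile(r"\bTODO\s*\(([^)]+)\)")


def assignee(comment):
    """Return the name in parentheses after TODO, or None if there is none."""
    match = ASSIGNED_TODO.search(comment)
    return match.group(1).strip() if match else None


print(assignee("// TODO (patrickdevivo) Fix the ..."))
print(assignee("// TODO: clean this up"))
```

A real parser would also need to decide whether the parenthesized text is a person at all, since some projects put issue numbers or dates there.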


Files with the most TODOs


33	cluster/gce/util.sh
25	pkg/apis/core/types.go
23	staging/src/k8s.io/api/core/v1/types.go
21	staging/src/k8s.io/legacy-cloud-providers/aws/aws.go
20	staging/src/k8s.io/code-generator/cmd/conversion-gen/generators/conversion.go
20	pkg/apis/core/validation/validation.go
16	test/e2e/network/service.go
16	pkg/kubelet/kubelet.go
14	test/e2e/framework/util.go
14	pkg/kubelet/kubelet_pods.go


Authors with the most TODOs


deads2k				147
Clayton Coleman			105
Chao Xu				99
Dr. Stefan Schimanski		93
Jordan Liggitt			81
David Eads			60
Random-Liu			54
Wojciech Tyczynski		50
Yu-Ju Hong			43
Prashanth Balasubramanian	38


Commits that added the most TODOs (counting only TODOs still in the code)


64	6a4d5cd7cc58e28c20ca133dab7b0e9e56192fe3
19	e01ff1641c7321ac81fe5775f6ccb21aa6775c04
19	4fb28dafad121e163fa86dc90067ce3d14415811
18	adb75e1fd17b11e6a0256a4984ef9b18957d94ce
14	963c85e1c807efcdbb82dd44439dc3c55f6a0bfd
14	8b17db7e0c4431cd5fd9a5d9a3ab11b04e2f0a7e
13	f0f78299348afcf770d4e8d89dcea82f80811b28
11	d0b94538b9744d0c06df6ddec2604be168568f9d
10	f1248b9c829e225138ab6d6234221c63092f7592
10	cd663d7ad00937cffa8a09e4761acb95d34c89a3


TODOs by year added


34	2014
249	2015
523	2016
650	2017
435	2018
489	2019
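The yearly counts above can be cross-checked against the other figures in the article with a few lines of arithmetic. The numbers are copied from the table; only the variable names are invented here.

```python
# Surviving TODOs by the year they were added, copied from the table above.
todos_by_year = {2014: 34, 2015: 249, 2016: 523, 2017: 650, 2018: 435, 2019: 489}

total = sum(todos_by_year.values())
before_2019 = sum(n for year, n in todos_by_year.items() if year < 2019)

print(total)        # 2380, matching the overall TODO count
print(before_2019)  # 1891, i.e. "over 1800" TODOs older than 2019
```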


To reproduce these figures, export the TODOs as CSV with tickgit todos --csv-output, then load the result into SQLite.


This is a fairly cursory look at the TODO comments in the Kubernetes source code. We see the most active TODO writers, who more or less coincide with the project's leading contributors.

We also see that Kubernetes treats TODO comments no differently from the norm; there are simply a lot of them because the codebase is so large.

An important observation is that there are more TODO comments than open GitHub issues. This is interesting because it points to a significant number of "hidden" tasks that are not immediately visible on GitHub but are written down in the source code.

The main contributors probably know their areas of the codebase well and have a clear picture of their own TODOs and "hidden work". But this is not always visible to outside observers, who find issues on GitHub (or in other public trackers) more familiar and easier to follow.

Most developers understand that software projects "live and breathe." Changes are frequent, improvement is ongoing, bugs get fixed, and a lot of discussion takes place. It is very important to organize the workflow well, because good code requires constant thought. In part, we see this in action through the TODO comments in the Kubernetes sources. Although we have nothing to compare it with, an average TODO age of 2.3 years seems rather high. Developers close to the project can judge this figure more objectively. It would be interesting to compare it with other large open source projects.

A deeper analysis would include every TODO in the project's history, not just those that remain today. It could address questions such as:

  • How fast do TODOs close?
  • What is the average lifespan of a TODO comment?
  • What do popular codebases look like in comparison?
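As a rough sketch of what such a historical analysis might compute, the snippet below derives lifespans for closed TODOs and ages for open ones. The records are made-up examples; extracting real add and remove dates from git history is not shown and would be the hard part.

```python
from datetime import date
from statistics import mean

# Hypothetical history: (date added, date removed), with None if still present.
history = [
    (date(2015, 1, 10), date(2015, 3, 11)),
    (date(2016, 5, 1), date(2016, 5, 31)),
    (date(2017, 2, 1), None),
]


def closed_lifespans(records):
    """Lifespan in days of every TODO that has been removed."""
    return [(removed - added).days for added, removed in records if removed is not None]


def open_ages(records, as_of):
    """Age in days, on a reference date, of every TODO still present."""
    return [(as_of - added).days for added, removed in records if removed is None]


lifespans = closed_lifespans(history)
print(lifespans, mean(lifespans))
print(open_ages(history, as_of=date(2017, 2, 11)))
```

Separating closed and open TODOs matters here: averaging only the survivors, as the article does, says how old the backlog is, not how quickly TODOs usually get resolved.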

Why does it matter?


TODO comments usually cover work that is too small for an issue but important enough to be noted and described in a comment (although many of them do reference issues). Because comments are part of the code, they are often "closer" to the work that needs to be done. They are easy to add, but apparently just as easy to forget: the Kubernetes sources still contain over 1,800 TODOs added before 2019.

We hope that our tool for analyzing metadata in code will help developers maintain projects of any size. Bringing TODO comments to the surface is only part of what needs to be done.

Source: https://habr.com/ru/post/undefined/

