Open source contributions
This chapter outlines some of the contributions, in the form of pull requests, made by several teams to their respective open source projects.
Keras
This pull request fixes a reported issue in the preprocessing module of Keras related to tokenizing text input data. It corrects the behavior to allow for multi-character separators between tokens in the input, which had been documented as the intended behavior, but was not correctly implemented. Providing support for the differing string implementations in Python 2 and Python 3 introduced some additional complexity to the change. The pull request also included regression tests for cases which would have failed on the previous version.
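The idea behind the fix can be illustrated with a minimal sketch. This is not the code from the pull request: the function name `split_text` and its parameters are hypothetical and only loosely follow the behavior described above. Filtered characters are replaced by the separator, and the text is then split on the whole separator string, so a separator longer than one character works.

```python
def split_text(text, filters=",.!?", split=" ", lower=True):
    """Split text into tokens on a separator of any length.

    Hypothetical helper for illustration; not the Keras implementation.
    """
    if lower:
        text = text.lower()
    # Replace every filtered character with the (possibly multi-character) separator.
    for ch in filters:
        text = text.replace(ch, split)
    # str.split accepts separators of any length; drop the empty tokens
    # produced by consecutive separators.
    return [token for token in text.split(split) if token]


print(split_text("Hello::big::World", split="::"))
```

Working on whole strings rather than a per-character translation table is what allows the separator to have more than one character in this sketch.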
François Chollet, the lead developer of Keras, asked about the performance impact of the change, and a variety of benchmarks for different use cases were provided by our team. The pull request was merged shortly after.
Toyplot
When formatting tables, developers can opt to use a Formatter class for their values; however, Toyplot only had Formatters for Float values. In our pull request we added Unit and Currency Formatters, with accompanying tests and documentation. Users can now use our classes to format different units such as centimeters, inches, and pixels, or format currencies such as euros, pounds, or different dollars.
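As a rough illustration of what such formatters do, the sketch below defines two small classes. The class names, constructor arguments, and `format` method here are assumptions made for this example and do not reproduce Toyplot's actual Formatter interface.

```python
class UnitFormatter:
    """Hypothetical formatter that appends a unit label to a number."""

    def __init__(self, unit, places=1):
        self._unit = unit
        self._places = places

    def format(self, value):
        # Fixed number of decimal places, followed by the unit label.
        return f"{value:.{self._places}f} {self._unit}"


class CurrencyFormatter:
    """Hypothetical formatter that prefixes a currency symbol."""

    SYMBOLS = {"eur": "€", "gbp": "£", "usd": "$"}

    def __init__(self, currency="usd", places=2):
        if currency not in self.SYMBOLS:
            raise ValueError(f"unknown currency: {currency!r}")
        self._symbol = self.SYMBOLS[currency]
        self._places = places

    def format(self, value):
        # Thousands separator plus a fixed number of decimal places.
        return f"{self._symbol}{value:,.{self._places}f}"


print(UnitFormatter("cm").format(2.5))
print(CurrencyFormatter("eur").format(1234.5))
```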
Cartographer
https://github.com/googlecartographer/cartographer/issues/53
Issues encountered with submitting a pull request
Due to the nature of Cartographer, our team had a difficult time making a meaningful contribution to the code base. Cartographer is in a late stage of its development: it was released as open source code in 2016, but by then it had already been in development for two years. It has been a complete product for a while now, and it is written in a style that we assume to be in-house to Google and that we are unfamiliar with. Our team looked through the issue list but had a hard time finding a good issue to pursue.
Cartographer, as can be grasped from the rest of the report, is a complicated program that is very tightly coupled. It has already been through many rounds of performance optimization and is built around threading and protocol buffers. These optimizations mean that small changes can have far-reaching effects, which our team cannot reliably predict given our very limited C++ knowledge.
One of the open topics for Cartographer that we were advised to look at was a refactor. This refactor aims to increase code quality and security by making functions in a file const. Making these functions const will help protect data inside objects in Cartographer. Due to our limited C++ knowledge, our team was uncomfortable making such a change, as the code follows a style that our group does not understand well enough to write. While it is still possible to trace the code and gain insight, changes to major files are not something we wanted to burden the Cartographer team with. Another problem we noticed is that the target of the change has a header file but no source file. Our team has assumed that this request is obsolete due to the current refactoring of mapping_2D and mapping_3D.
The issue we are attempting to pursue involves documentation for Docker. Following the contribution guide laid out in Cartographer, we tried to contact the team in February to seek some guidance. Unfortunately, it seems our request was lost and we were unable to connect with the team. Due to the requirements of our project, we decided to move forward with this request to the best of our ability. So far we have submitted some general documentation on building and running, but we are unsure of where to go from here. We hope our contribution will advance the code's documentation or act as a stepping stone for future documentation.
RocketChat
While using CodeScene as our code quality analysis tool, we identified two issues in the JavaScript code for uploading files to the Amazon server: 1. Error handling for a valid file URL was missing. 2. The codebase did not use Node's event handling pipeline and thus diverged from that standard.
The issue has been reported at https://github.com/RocketChat/Rocket.Chat/issues/10289. The error handling and the asynchronous call functionality were added to the codebase.
SwaggerUI
After our analysis and system study, we found that the component 'HighlightCode' does not handle large inputs very well. The component would expand as far as possible to fit the content (normal behavior), making the page extremely long and hard to use. Since usability is a part of our QAS, we decided to fix this. This could also be related to #3640. By making the component scrollable, the page remains manageable even when previewing very large responses. We also added a button to download the contents displayed in the 'HighlightCode' component.
Tensorboard
According to the SonarQube and CodeScene reports, duplication is one of the main sources of technical debt as the project expands. In addition, SonarQube reports that some conditional structures contain two branches with the same implementation. The pull request tackles these two issues: we use class inheritance to eliminate the duplication and delete the redundant branches in the "if" structures. Furthermore, we modified the related files to ensure the project remained valid. After finishing all the changes and tests, we scanned the project with SonarQube again: the duplication rate decreased from 9.2% to 2.5%.
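The two refactorings can be pictured with a small sketch. All class and function names below are hypothetical and are not taken from the Tensorboard code base; the sketch only shows the general pattern of moving shared code into a base class and collapsing a conditional whose branches are identical.

```python
class BasePlugin:
    """Behavior that used to be copied into each subclass lives here once."""

    name = "base"

    def describe(self):
        return "plugin:" + self.name


class ScalarPlugin(BasePlugin):
    # Only the part that differs is defined in the subclass.
    name = "scalars"


class ImagePlugin(BasePlugin):
    name = "images"


def route(plugin, is_cached):
    # Before the cleanup, an if/else on is_cached returned
    # plugin.describe() in both branches. Since the branches were
    # identical, the conditional is dropped; the parameter is kept
    # here only so that existing callers keep working.
    return plugin.describe()


print(route(ScalarPlugin(), is_cached=True))
print(route(ImagePlugin(), is_cached=False))
```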
Strapi
Strapi with its new version is now compatible with Mongo and SQL databases. By convention, SQL databases use the id attribute to identify entries, whereas Mongo uses _id as the identifier. To make the backend and frontend uniform, the Mongo _id attribute is duplicated.
However, we found that id and _id are duplicated in the JSON response from the API as well. It is unnecessary to return redundant identifier values (id and _id) with every API request. We are therefore proposing changes to the strapi-generate-api package to band-aid this issue with this pull request.
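The effect of removing the redundant identifier can be sketched as follows. Strapi itself is written in JavaScript, so this Python snippet only illustrates the idea; the function name is hypothetical, and the choice to keep `id` and drop `_id` is an assumption made for the example rather than something the text above states.

```python
import json


def drop_duplicate_id(entry):
    """Return a copy of entry without `_id` when it only repeats `id`."""
    if "id" in entry and "_id" in entry and str(entry["id"]) == str(entry["_id"]):
        return {key: value for key, value in entry.items() if key != "_id"}
    # Nothing redundant to remove: return an unchanged shallow copy.
    return dict(entry)


# Example entry as it might appear before serialization (made-up data).
entry = {"_id": "5ab1", "id": "5ab1", "title": "Hello"}
print(json.dumps(drop_duplicate_id(entry)))
```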
Homebrew
Improving maintainability, performance, and reliability in brew search
The brew search logic currently searches broadly, and then processes the JSON response locally to filter some results out. We saw an opportunity to improve on this approach by making the search more specific, which avoids having to filter out the response. This improves the performance of the search function. This change also simplified the source code itself, which increases the maintainability for developers, with the added bonus of reducing cognitive complexity.
Since the local Homebrew installation queries the GitHub API server over the network, it is important to handle errors caused by network issues, server crashes, or similar failures. Implementing a begin-rescue block provides a measure of error handling, such that a failure in one area will not cause the running program to fail altogether. This increases the reliability of the system.
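Homebrew is written in Ruby, where this pattern is a begin-rescue block. The sketch below shows the same pattern with Python's try/except, which plays the same role. The function name, the injectable `fetch` parameter, and the timeout are assumptions made for illustration and do not mirror Homebrew's code.

```python
import json
import urllib.error
import urllib.request


def search_remote(url, fetch=urllib.request.urlopen):
    """Return the parsed JSON from url, or None if the request fails."""
    try:
        with fetch(url, timeout=10) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as error:
        # Network failure, server error, or malformed JSON: report the
        # problem and let the caller continue with local results only.
        print("remote search failed:", error)
        return None
```

A caller can treat `None` as "no remote results" and continue with whatever it found locally, so a network problem does not abort the whole command.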
Homebrew relies on the open source community to maintain its Formulae and Casks, so when a new version of a Homebrew-indexed application is released, the version must be updated in Homebrew's Caskroom. This involves downloading the newly released product as a binary, calculating the SHA256 hash of the binary, and verifying that value against the hash provided by the product owner.
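The hash check described above can be sketched with Python's standard library. The function names are hypothetical, and the published hash is assumed to be available as a hexadecimal string.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the file's digest with the hash provided by the product owner."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

If `matches_published_hash` returns False, the download should be treated as corrupted or tampered with, and the version should not be updated.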