Homebrew
As a requirement of the University of Victoria’s Documenting and Understanding Software Systems course, our team was tasked with exploring the architecture of an open source software project of our choosing. The four eligibility criteria were:
the project must be active (i.e. daily commits)
moderately large (>100 stars on GitHub)
only moderately complex, and
open source.
Given that all of our team members were macOS users, we chose to study Homebrew.
Homebrew is a free and open-source package management system that simplifies the installation and upgrading of software (and its specific dependencies) on Apple's macOS operating system. Dependencies are handled automatically: a user may simply run `brew install <package_name>`, and the dependencies of `<package_name>` are downloaded, verified, and installed beforehand. One notably convenient service that Homebrew provides is a cache of precompiled binaries (compiled specifically on and for macOS). This makes installing developer tools in particular much more convenient, since the macOS user is not required to install or configure the compilation toolchain(s) on their local machine.
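The "dependencies first" ordering described above can be illustrated with a topological sort. This is a minimal sketch, not Homebrew's implementation: the dependency map and the `install_order` helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: package -> packages it depends on.
DEPENDENCIES = {
    "wget": {"openssl", "libidn2"},
    "libidn2": {"libunistring"},
    "openssl": set(),
    "libunistring": set(),
}

def install_order(package, dependencies):
    """Return packages ordered so that every dependency precedes its dependents."""
    # Collect only the packages reachable from the requested one.
    needed, stack = set(), [package]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(dependencies.get(name, ()))
    # TopologicalSorter expects node -> predecessors, which matches our map.
    graph = {name: dependencies.get(name, set()) for name in needed}
    return list(TopologicalSorter(graph).static_order())

order = install_order("wget", DEPENDENCIES)
print(order)
```

The requested package always comes last in the returned order, because everything else in the graph is one of its direct or transitive dependencies.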
Homebrew's main focus since its inception has been user-friendliness, both from a user experience perspective and from the perspective of a new or would-be open source contributor. Accordingly, the package manager's core and its thousands of individual "Formulae" are written in Ruby, a high-level, open-source language that is "natural to read, and easy to write".
This user-centric design has been very effective, and has contributed to Homebrew's success as an extremely popular open source project:
Homebrew has an active community of over 10,000 contributors on GitHub.
In 2010, Homebrew was the third-most-forked repository on GitHub.
In 2012, Homebrew had the largest number of new contributors on GitHub.
In 2013, Homebrew had both the largest number of contributors and issues closed of any project on GitHub.
Table 1. Homebrew stakeholders
| Role | Concerns | Instances |
| --- | --- | --- |
| Active Maintainers | Oversee pull request issues and determine if they are real issues with viable solutions to improve the product | Invited members who have made consistently high quality contributions and advanced updates. |
| Contributors | Add pull request issues and propose an alternative or solution | Infrequent contributors to the project who may need assistance from the maintainers. |
| Independent Developers | Ensure that the system is correctly and efficiently installing their applications | Entrepreneurs and solo developers who use Homebrew to promote and distribute their applications. |
| Users | Determine the system’s performance and efficiency and make use of it | Thousands of macOS users around the world who depend on Homebrew to install and upgrade their systems' packages. |
| Sponsors | Provide support for the overall functionality of the system | MacStadium hosts Xserve ESXi boxes for continuous integration and testing; DigitalOcean hosts Jenkins continuous integration server; Commsworld hosts physical hardware; Bintray hosts the binary packages; Netlify hosts the brew.sh website; and AgileBits provides secure password storage and synchronization. |
Contributing to the growth and continuity of the organization - The Homebrew system is the sole reason for the existence of the Homebrew organization.
Improving business processes - Streamlining the delivery and management of packages allows for more efficient development and distribution of software by users (largely developers).
Managing change in environmental factors - Automatic dependency management allows for flexibility in the configuration of a system, i.e. a dependency change can be dynamically accommodated.
Managing market position - Homebrew leverages the open source community's contributions, which help to keep its package index current.
Managing the quality and reputation of products - Homebrew provides (by default) only stable versions of software, ensuring maximal compatibility and system stability.
Meeting responsibility to society - Homebrew exists fundamentally to improve the package management strategy for macOS users. It is free and open source.
Checking package dependencies must meet a time constraint [1]
Verifying dependency integrity must meet a time constraint [2]
The shell autocompletion must meet a time constraint and avoid unnecessary disk reads [3]
Checking for updates must avoid unnecessary disk operations on unchanged files [4]
All external resources (taps) must be sandboxed by default [5]
Running software commands with sudo must be disabled [6]
Sensitive software tokens and private keys must be hidden [7]
The package retriever must get and validate package checksums [8]
The system must be able to create and maintain external sources (taps) for Formulae/commands [9]
Homebrew must be installable via URL [10]
Figure 1. Utility tree
Table 2. QAS I for Performance in template form
| Aspect | Details |
| --- | --- |
| Scenario Name | Homebrew lists dependencies at a rate of at least 25 packages per second |
| Business Goals | Improve business processes |
| Quality Attributes | Performance |
| Stimulus | Command `brew uses` |
| Stimulus Source | User |
| Response | System outputs its dependencies |
| Response Measure | The system lists at least 25 packages per second |
Table 3. QAS II for Performance in template form
| Aspect | Details |
| --- | --- |
| Scenario Name | Homebrew scans package dependency libraries at a rate of at least 10 packages per second |
| Business Goals | Improve business processes and speed |
| Quality Attributes | Performance |
| Stimulus | Command `brew linkage` |
| Stimulus Source | User |
| Response | System outputs its dependencies |
| Response Measure | The system scans package dependency libraries at a rate of at least 10 packages per second |
Table 4. QAS III for Security in template form
| Aspect | Details |
| --- | --- |
| Scenario Name | User entering command `sudo brew` causes the Homebrew system to raise an error and return a warning |
| Business Goals | Managing the quality and reputation of products |
| Quality Attributes | Security |
| Stimulus | Command `sudo brew` |
| Stimulus Source | User |
| Response | System raises an error |
| Response Measure | The system does not execute the command and outputs the correct warning for usage of sudo |
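The security scenario in Table 4 amounts to a guard that runs before any command is executed. The sketch below is one way such a guard could look; the function names and the warning text are assumptions, not Homebrew's actual code or message.

```python
import os

def check_not_root(euid):
    """Raise if the effective user ID is root (0), as the scenario requires."""
    if euid == 0:
        raise PermissionError("Running this tool as root is not supported.")

def main(euid=None):
    # Fall back to the real effective UID (Unix only) when none is injected.
    euid = os.geteuid() if euid is None else euid
    try:
        check_not_root(euid)
    except PermissionError as error:
        # The command is not executed; only the warning is reported.
        return f"Error: {error}"
    return "ok"

print(main(euid=0))
print(main(euid=501))
```

Passing the UID in as a parameter keeps the check testable without actually running as root.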
This diagram presents a high level overview of the elements and the relationships among them that populate the view. This diagram includes the primary elements and relations of the view, but not necessarily all of them.
From Milestone 2, the project group identified several Quality Attribute Scenarios, and this primary presentation was built through simulation of these specific QAS.
Figure 2. Module view primary presentation diagram
This section describes and expands on the elements of the Primary Presentation above.
Library: Library contains the Homebrew module and the taps.
Homebrew: Homebrew package contains the package and dependency management logic to install Formulae, but does not contain the actual Formulae for the software packages that users want to install.
Taps: Tap (implemented as a Git repository) contains Formulae and commands. By adding more "Taps", the list of Formulae that Homebrew tracks can be increased.
Formula: Defines a package. Describes the dependencies, source, and installation of a software package. Each Formula class inherits from the Formula class defined in the `formula.rb` file in the Homebrew module.
Cellar: Location on user's system where packages are installed.
bin/brew: This module/file acts as an entry point and initializes environment variables. It sets up the location of the local Homebrew repository and executes the `brew.sh` shell script.
brew.sh: Other than preparing Homebrew for execution (by setting up environment variables like system information, user's GitHub token, Cellar and cache locations), it primarily decides whether the user-executed command is a normal end user command or a developer command.
cmd: Contains Ruby and Bash scripts that deal with normal operation of Homebrew system. Tasks like installation, searching of packages, updating package index, and other instructions all fall under this category.
dev-cmd: These are commands that a developer would run. Commands like `brew create`, `brew tap-new`, `brew audit`, etc. all fall under this module.
completions: This module contains tab completion indexes which provide for *automagic* tab completion in Bash and Zsh shell environments.
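The split between `cmd` and `dev-cmd` described in the catalog above suggests a simple lookup: find which directory holds a script for the requested command. The following is a sketch under that assumption; `classify_command` and the file layout it builds are hypothetical and do not reproduce the logic in `brew.sh`.

```python
from pathlib import Path
import tempfile

def classify_command(library, command):
    """Return 'cmd', 'dev-cmd', or None depending on where a script for the command exists."""
    for module in ("cmd", "dev-cmd"):
        for extension in (".sh", ".rb"):
            if (Path(library) / module / f"{command}{extension}").is_file():
                return module
    return None

# Build a throwaway directory tree to exercise the lookup.
with tempfile.TemporaryDirectory() as library:
    for relative in ("cmd/install.rb", "cmd/update.sh", "dev-cmd/audit.rb"):
        path = Path(library) / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    results = {
        name: classify_command(library, name)
        for name in ("install", "update", "audit", "nope")
    }

print(results)
```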
This diagram shows the interactions between the Homebrew system and its environment.
Figure 3. Module view context diagram
This Sequence Diagram depicts a typical use case scenario for a Homebrew user - installing a package. This diagram should reveal the ordering of interactions (and message passing) among the elements and illustrate the sequential dependencies of this use case scenario.
Figure 4. Module view sequence (behavior) diagram
Homebrew is the missing package manager for macOS. It’s widely popular for being a free and open-source software system. Its main purpose is to allow users to easily install software packages and automatically manage their dependencies - something that was not implemented by Apple. The organization of Homebrew's modules is primarily influenced by three main quality attributes - Security, Maintainability, and Availability - through which our project team has identified several quality attribute scenarios:
Security
User creating a Tap makes Homebrew sandbox the Tap by default.
Maintainability
User entering command `brew install` causes the system to perform an update (if the system hasn’t been updated in the last 24 hours), ensuring that the most recent information is available, and then switch to normal package installation mode.
Availability
User entering a URL source for a package in the `brew install` command causes the system to install the package from the URL source.
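The Maintainability scenario above hinges on a staleness check before installation. A minimal sketch follows, assuming the time of the last update is available as a timestamp; the names `needs_update` and `install` are illustrative only.

```python
from datetime import datetime, timedelta

AUTO_UPDATE_INTERVAL = timedelta(hours=24)  # threshold taken from the scenario above

def needs_update(last_update, now, interval=AUTO_UPDATE_INTERVAL):
    """True when no update is recorded or the last one is at least `interval` old."""
    return last_update is None or now - last_update >= interval

def install(package, last_update, now):
    """Return the steps that would run, in order, for an install request."""
    steps = []
    if needs_update(last_update, now):
        steps.append("update")
    steps.append(f"install {package}")
    return steps

now = datetime(2018, 2, 1, 12, 0)
print(install("wget", now - timedelta(hours=30), now))
print(install("wget", now - timedelta(hours=2), now))
```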
Homebrew’s design allows for security for both users and maintainers. Homebrew’s structure has a key feature which sandboxes all newly created user taps to protect users from non-Homebrew Taps. Additionally, all downloaded binaries have their SHA-256 hashes calculated and verified against package publisher's values prior to install, which ensures package integrity.
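The checksum verification mentioned above can be sketched with Python's standard `hashlib`. This illustrates the general technique only; `verify_download` is a hypothetical helper, and the expected digest would in practice come from the package definition.

```python
import hashlib
import hmac

def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_download(data: bytes, expected_sha256: str) -> str:
    """Return the digest if it matches the published value, otherwise raise."""
    actual = sha256_of(data)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"SHA-256 mismatch: expected {expected_sha256}, got {actual}")
    return actual

# SHA-256 of the empty byte string, used here as a known test vector.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
print(verify_download(b"", EMPTY_SHA256))
```

Installation would proceed only if `verify_download` returns; a mismatch aborts before anything is written to the system.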
Homebrew’s position as a leading open source project on GitHub also allows it to be easily available and accessible to all users, maintainers, and contributors. Users can install Homebrew from Homebrew’s GitHub repository through a one-line shell script. Users can easily install unlisted packages with a short URL - for example, users can enter the command `brew install http://example.com/<some_package>.rb` and have the package install directly to their local machine.
An additional benefit of using GitHub is that Homebrew can be maintained conveniently. This matters because Homebrew requires its community-contributed modules to be easy to create and update, and its package index to be continually updated to include the newest available versions of packages. Since it is a free (non-commercial) project, the burden of its maintenance and development rests solely on volunteers, with a relatively flat organizational hierarchy and no corporate oversight.
The component and connector view shows the software system as a set of collaborating units of runtime behavior. In this view, Homebrew is illustrated in a Primary Presentation diagram in the Client-Server style, and its elements and interface documentation are described in the Element Catalog. Homebrew relies on both system utilities and external sources - this is illustrated by the Context Diagram. The Variability Guide illustrates where things can change in the Homebrew system. The Behaviour Diagram captures the workflow of the typical use case scenario of Homebrew - a user installing a package. Finally, the Homebrew creators' decisions and reasoning for the current organization of Homebrew's components are presented in Rationale.
To give a better behavioural understanding of the `brew uses`, `brew linkage`, and `brew update` Homebrew commands, as well as the tab completion feature, a Component and Connector view has been provided as a high-level runtime description of the system. This diagram includes the code trace from the user's entry of the command all the way through the relevant system/Homebrew components, and finally to the obtained results.
Figure 5. Component and connector view primary presentation diagram
client: Represents the terminal used to interact with Homebrew.
brew.sh: Other than preparing Homebrew for execution (by setting up environment variables like system information, user's GitHub token, Cellar and cache locations), it primarily decides whether the user-executed command is a normal end user command or a developer command.
linkage service: Abstract representation of linkage.rb script. This service checks the library links of an installed formula. LinkageChecker class is used to do the actual heavy lifting of checking dependencies.
uses service: Abstract representation of the uses.rb script. Displays the formulae that specify a certain formula as a dependency. (Formula: the package definition.)
update service: Abstract representation of update.sh script. Fetches the newest version of Homebrew and all formulae from GitHub.
completions service: Provides tab completions for bash and zsh.
Local Homebrew Installation: Represents the actual local installation of Homebrew, e.g. "/usr/local/Homebrew".
Taps: A Git repository of Formulae and/or commands. Allows extension of core Homebrew functionality by providing the capability to add additional formulae and/or external commands, and can be used by any Homebrew user.
Keg: The installation prefix of a Formula found in homebrew-core.
Homebrew_Cellar: Location on user's system where packages are installed.
System Library: Software packages that are installed on the user's system outside of Homebrew ecosystem that the Formula depends on.
Remote Repositories: External sources of packages, e.g. GitHub.
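The behaviour of the "uses service" in the catalog above is essentially a reverse lookup over the dependency graph. A sketch of that idea follows; the dependency map is hypothetical and the function is not taken from uses.rb.

```python
# Hypothetical dependency map: formula -> formulae it depends on.
DEPENDENCIES = {
    "wget": {"openssl"},
    "curl": {"openssl"},
    "git": {"curl"},
    "openssl": set(),
}

def uses(target, dependencies, recursive=False):
    """Return the sorted names of formulae that depend on `target`."""
    direct = {name for name, deps in dependencies.items() if target in deps}
    if not recursive:
        return sorted(direct)
    # Walk upwards through dependents of dependents.
    found, frontier = set(), set(direct)
    while frontier:
        name = frontier.pop()
        if name not in found:
            found.add(name)
            frontier |= {n for n, deps in dependencies.items() if name in deps}
    return sorted(found)

print(uses("openssl", DEPENDENCIES))
print(uses("openssl", DEPENDENCIES, recursive=True))
```

Each call scans every formula's dependency set, so the cost grows with the size of the package index. That is one plausible reason a rate target such as "25 packages per second" is meaningful for this command.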
5.3.1 Interface 1
Table 5. Interface I
Interface identity
CheckFormulaLibraryLinksInterface
Resources provided
Syntax: `linkage [--test] [--reverse]`. Each of these is provided as a command line argument.
Semantics: Arguments in [ ] are optional and arguments in < > are required.
Error handling: Resource state is impacted depending on the error.
Data types
"Formula" is the package definition.
Arguments are command line arguments.
5.3.2 Interface 2
Table 6. Interface II
Interface identity
HomebrewUpdateInterface
Resources provided
Syntax: `update [--merge] [--force]`. Each of these is provided as a command line argument.
Semantics: Arguments in [ ] are optional.
Error handling: Resource state is impacted depending on the error.
Data types
Arguments are command line arguments.
Error handling
If the structure of Taps has changed, user intervention may be needed to make the directory structure comply with the new structure. If git stash fails, the user needs to stash/commit manually. User enters `brew update`. This service responds by showing correct usage of the Interface. If the Cellar or the directory containing Homebrew is not writable, a message is displayed asking the user to provide appropriate permissions.
Quality attributes
Performance greatly impacted.
Rationale
Designing the interface this way follows the general Homebrew command pattern that the user already understands. This interface also allows the script to be executed easily by the Homebrew system, which runs it through shell scripts.
Usage Guide
Usage is straightforward. Running just `brew update` fetches the newest version of Homebrew and all formulae from GitHub using git and performs any necessary migrations. If `--merge` is specified, git merge is used to include updates rather than the default git rebase. If `--force` (or `-f`) is specified, a slower and more complete update check is performed even if it is unnecessary.
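The flag semantics in the usage guide above can be sketched as a small argument parser. The real `brew update` is a shell script; this Python sketch only mirrors the documented flags, and `plan_update` and its result keys are hypothetical.

```python
import argparse

def build_parser():
    """Parser for the two optional flags documented in the interface."""
    parser = argparse.ArgumentParser(prog="update")
    parser.add_argument("--merge", action="store_true",
                        help="integrate updates with git merge instead of git rebase")
    parser.add_argument("-f", "--force", action="store_true",
                        help="always perform the slower, more complete update check")
    return parser

def plan_update(argv):
    """Translate command line arguments into the update strategy they select."""
    args = build_parser().parse_args(argv)
    return {
        "integrate_with": "git merge" if args.merge else "git rebase",
        "full_check": args.force,
    }

print(plan_update([]))
print(plan_update(["--merge", "-f"]))
```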
The Homebrew functions for `brew uses` and `brew linkage` reside within the `cmd` and `dev-cmd` directories respectively, both part of the main Homebrew GitHub repository. In addition to macOS using Homebrew for package installations and updates, Homebrew uses other external entities to update itself and the packages that it manages. The homebrew-core and various Tap repositories provide a list of Formulae (package installation scripts) to be updated once changed. This is also related to the Homebrew repository, which hosts all of the main Homebrew commands. The package entities are accessed individually through their respective hosting services. This allows the packages to be downloaded and validated against their checksum. Although `brew uses` and `brew linkage` are small components of Homebrew, they are important parts of the larger system structure that makes up the macOS package manager.
Figure 6. Component and connector view context diagram
5.5.1 Adding/Removing Formulae in homebrew-core
A contributor may add, remove, or edit Formulae that exist within the homebrew-core repository. Once merged, the Homebrew system will automatically update itself via `brew update`.
5.5.2 Adding/Removing Formulae in an external Tap
An external Tap may include different Formulae than the main homebrew-core repository. Similarly, a contributor to this Tap may add, remove, or edit Formulae that exist within the repository. Once merged, the Homebrew system will automatically update itself through the use of `brew update`. Homebrew will respect the user-set priority for Taps and for Formulae that exist in multiple Taps. If users want a certain Formula to be installed from a specific Tap, they can explicitly mention it at install time.
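The name resolution described above (a priority order across Taps, with an explicit Tap overriding it) can be sketched as follows. The tap contents, the priority list, and the `resolve` function are hypothetical; the sketch assumes that an explicit Tap is written as a `user/tap/formula` prefix.

```python
# Hypothetical tap contents: tap name -> formula names it provides.
TAPS = {
    "homebrew/core": {"wget", "python"},
    "example/tools": {"python", "mytool"},
}

def resolve(name, taps, priority):
    """Resolve a formula name to a (tap, formula) pair."""
    if name.count("/") == 2:
        # Fully qualified name: the user explicitly chose the tap.
        user, repo, formula = name.split("/")
        tap = f"{user}/{repo}"
        if formula not in taps.get(tap, set()):
            raise KeyError(f"{formula} not found in {tap}")
        return tap, formula
    # Short name: the first tap in priority order that provides it wins.
    for tap in priority:
        if name in taps.get(tap, set()):
            return tap, name
    raise KeyError(f"{name} not found in any tap")

PRIORITY = ["homebrew/core", "example/tools"]
print(resolve("python", TAPS, PRIORITY))
print(resolve("example/tools/python", TAPS, PRIORITY))
```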
This sequence diagram shows the system behavior when the client invokes a `brew install <formula>` command. It complements the primary presentation diagram in that it shows in more detail how the "update service" works.
Before any package is installed, the Homebrew system is updated. This logic has been abstracted into the "update service". In this scenario, the `brew update` command is called with a `--preinstall` flag. The latest version of Homebrew and all the Formulae are downloaded and the local git repository is updated. The details of this installation are not shown in the diagram.
After the Homebrew system has been updated, the `brew.sh` module starts the actual installation by invoking the installation service using the command passed in from the client. The install service (`install.rb`) then fetches the packages based on user arguments. We don't use the term "binary" here because the user may opt to compile the Formula locally instead.
The actual installation of the Formula is delegated to the Formula installer service (`formula_installer.rb`).
Figure 7. Component and connector view behavior diagram
The Client-Server component and connector view style was chosen for the primary presentation of Homebrew’s core functionality because the client (terminal emulator) uses the server (Homebrew) to obtain results pertaining to package management on their operating system. The client makes requests which are parsed by the `brew.sh` file, which then calls the appropriate module for the server to fulfill the request. In this case, the primary presentation focuses on the performance attribute by depicting the linkage, uses, and update workflows concerning the return rate to the client. While the update service command will update the host's Homebrew repository and Formulae, the `linkage` and `uses` services will return package dependency lists and display them to the client.
The main focus of the Quality Attribute Scenarios (QAS) portrayed in the primary presentation is performance. This diagram further refined the QAS, including:
`brew uses` calculates dependencies for a Formula and displays them at a rate of at least 25 dependencies per second.
`brew linkage` calculates dependencies for a Formula at a rate of at least 10 dependencies per second.
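Rate targets like the two above can be checked with a small timing harness. The sketch below is an assumption about how such a measurement could be set up; `packages_per_second` and `meets_target` are hypothetical helpers, and the clock is injectable so the calculation can be verified deterministically.

```python
import time

def packages_per_second(process, packages, clock=time.perf_counter):
    """Process every package and return the achieved throughput."""
    start = clock()
    for package in packages:
        process(package)
    elapsed = clock() - start
    return len(packages) / elapsed if elapsed > 0 else float("inf")

def meets_target(rate, target):
    """True when the measured rate satisfies the response measure."""
    return rate >= target

# Deterministic example: a fake clock reporting 2 seconds for 50 packages.
ticks = iter([0.0, 2.0])
rate = packages_per_second(lambda package: None, list(range(50)), clock=lambda: next(ticks))
print(rate, meets_target(rate, 25), meets_target(rate, 10))
```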
The diagram represents our QAS on a higher level that includes processing and run time events. For example, `brew update` is a critical functionality because it is called each time a package is installed or upgraded, and therefore has a disproportionate impact on performance. Performance is significantly improved by implementing caching in the update service.
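One way caching can reduce the cost of a frequently called update is to skip the expensive fetch when the remote revision has not changed. The sketch below illustrates that general idea only; what the update service actually caches is not described here, and `update_tap`, the cache layout, and the revision strings are all hypothetical.

```python
def update_tap(name, remote_head, cache, fetch):
    """Fetch a tap only when its remote revision differs from the cached one."""
    if cache.get(name) == remote_head:
        return False  # nothing changed; skip the expensive fetch
    fetch(name)
    cache[name] = remote_head
    return True

fetched = []
cache = {}
first = update_tap("homebrew/core", "abc123", cache, fetched.append)
second = update_tap("homebrew/core", "abc123", cache, fetched.append)
third = update_tap("homebrew/core", "def456", cache, fetched.append)
print(first, second, third, fetched)
```

Only the first and third calls trigger a fetch, so repeated installs between upstream changes pay for a cheap comparison rather than a full download.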
Code quality and technical debt are generally prominent issues in many software projects. With system growth (in terms of lines of code), code quality often deteriorates in the absence of intensive review or guidelines, which obviously introduces technical debt. In the following subsections the Homebrew system's technical debt will be examined and evaluated with multiple code quality analysis tools.
To analyze the code quality and technical debt of the Homebrew base repository (Homebrew/brew), multiple tools were used including SonarQube, CodeClimate, and CodeScene. These tools provided insights into the code hotspots, refactoring targets, maintainability scores, and code smells while giving an overview of the system's technical debt. In the following sections, each tool will be examined with regards to the issues that were identified in their output.
6.1.1 SonarQube
SonarQube and SonarCloud (SonarQube's Platform as a Service) do not support the Ruby programming language, so the project team stood up a self-hosted SonarQube server and added Ruby plugins to it. The only SonarQube plugins available for Ruby are small, unofficial, and independently-developed plugins. The project team experimented with the three most popular suitable plugins:
The plugins all primarily leverage Rubocop, which provides static analysis of Style, Layout, Lint, and syntactic/idiomatic Performance - essentially just linter analysis (see Figure 8).
Figure 8. SonarQube's Rubocop-dependent analysis
Deeper analysis of Homebrew with SonarQube was not possible given the limited availability/maturity of Ruby plugins for SonarQube.
6.1.2 CodeClimate
The base Homebrew repository was analyzed on an instance of Code Climate, yielding a maintainability rating, a technical debt time, and code smell and duplication evaluations. The maintainability rating gives an indication of the project's Technical Debt Ratio (TDR); Homebrew was rated a C, which corresponds to a TDR between 10% and 20%. Fixing the 638 code smells and 34 code duplicates (shown in Figure 9) would take an estimated 6 months of development time.
Figure 9. CodeClimate summary output
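The relationship between a Technical Debt Ratio and a letter rating can be sketched as below. The TDR is taken here as estimated remediation effort divided by estimated development effort, which is the usual definition. Only the C band (10% to 20%) comes from the text above; the other band boundaries and the example hours are assumptions made for illustration.

```python
def technical_debt_ratio(remediation_hours, development_hours):
    """Estimated effort to fix the debt divided by effort to build the code."""
    return remediation_hours / development_hours

# Upper bounds per letter. Only the C band (10% to 20%) is stated in the text;
# the A, B, and D bounds are assumed for this sketch.
RATING_BANDS = [(0.05, "A"), (0.10, "B"), (0.20, "C"), (0.50, "D")]

def rating(tdr):
    """Map a ratio to a letter grade, falling through to F above the last band."""
    for upper, letter in RATING_BANDS:
        if tdr <= upper:
            return letter
    return "F"

tdr = technical_debt_ratio(remediation_hours=150, development_hours=1000)
print(tdr, rating(tdr))
```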
The code smells of Homebrew mainly include exceeding recommended file and method lengths, as well as excessively high cognitive complexity of methods. Similar to the CodeScene output, the main files with large cognitive complexities include `formula.rb`, `formula_installer.rb`, and `audit.rb`, where cognitive complexity is "the measure of difficulty of reading and understanding the code".
6.1.3 CodeScene
The analysis conducted by our instance of CodeScene Cloud proved to be the most insightful. CodeScene Cloud features an intuitive dashboard displaying the scope of the project, complexity warnings, code hotspots, and author highlights. Further into the code and technical debt hotspots, a graph is displayed for the complexity trend, change frequency, and code churn of each Ruby file. A primary example of this would be the `formula.rb` file, with 1223 lines of code, 167 commits, and a code churn of 48% (see Figure 10).
Figure 10. Codescene hotspot analysis for formula.rb
Files similar to `formula.rb` in terms of technical debt hotspots include `audit.rb`, `diagnostic.rb`, and `formula_installer.rb`, all with many lines of code and high code churn. CodeScene has other tools to evaluate a system's complexity trends and corresponding warnings. Once a file reaches a certain level of complexity relative to the file's length, a warning is raised. Currently, 3 files have high complexity: `audit_spec.rb`, `doctor.rb`, and `lines_cop.rb`.
Figure 11. CodeScene refactoring targets
Nearly every project maintainer would love more time to refactor the codebase, but this is not always feasible. The "Refactoring Targets" tab helps developers prioritize improvements to the highlighted files, with the modules in red being the most serious. This correlates to the Hotspot tab which indicated sections of code with most activity. There is a significant amount of overlap between the most active modules and the modules with the most debt.
We started our investigation by agreeing on key indicators of technical debt applicable to the Homebrew project:
Entities that require constant modification to remain useful,
Entities that only a few people understand,
Non modular entities,
Size of methods/files.
Based on these indicators and using the output from the array of code analysis tools, we identified the following modules as having the highest interest rate for technical debt:
1. `brew/Library/Homebrew/formula.rb` - While a lot of modifications occur due to the changing dynamics of installing packages, the `formula.rb` module is an example of a monolith [11]. Key indicators of debt are code churn, overly complex methods, and the majority of commits being made by a single person, which indicates that this module is not well understood by everyone.
2. `brew/Library/Homebrew/formula_installer.rb` - Key indicators are code churn, complex methods, and a lion's share of commits by a single person. An example of complexity in this module is the "install_dependency" method.
3. `brew/Library/Homebrew/diagnostic.rb` - Monolithic, with a majority of commits made by a single author and high code churn.
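The "majority of commits by a single person" indicator used in the list above can be computed from a file's commit history. This is a minimal sketch with made-up author data; the 50% threshold and the function names are assumptions, not values reported by the analysis tools.

```python
from collections import Counter

def dominant_author(commit_authors):
    """Return the most frequent author and their share of all commits."""
    counts = Counter(commit_authors)
    author, commits = counts.most_common(1)[0]
    return author, commits / len(commit_authors)

def is_knowledge_silo(commit_authors, threshold=0.5):
    """True when one author made more than `threshold` of the commits."""
    return dominant_author(commit_authors)[1] > threshold

# Hypothetical history for one file: one entry per commit.
history = ["alice"] * 6 + ["bob"] * 3 + ["carol"]
print(dominant_author(history), is_knowledge_silo(history))
```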
While these modules are not the only entities with technical debt, they are certainly the quick hits. Refactoring the most complex methods of these modules would provide greatest reductions in technical debt.
Technical debt may also be described as "the gap between making a change perfectly ... and making the change work as quickly as possible, with as few resources as possible"[12], and to this end, we sought out any design and management decisions made on the Homebrew project that resulted in increased maintenance overhead for developers and/or users at a later stage.
For example, the `formula.rb` file has become a monolithic piece of software, with high cognitive complexity and a large number of lines of code. Due to the nature of this file and the centralized functionality it possesses, the file has naturally accreted functions over time. The `formula.rb` file is the structure for all Homebrew Formula package definitions that are stored locally on the user's machine. Since Homebrew’s primary functionality is driven by Formulae, the Formula structure is sensitive to mistakes that could lead to issues further down the line. Even so, to reduce cognitive complexity, the Homebrew organization should consider separating this module into sub-modules.
Analyzing Homebrew’s structure and generating rough high-level documentation for the system provided a great opportunity to learn more about the inner workings of an open source project. Homebrew has migrated from a personal repository to a public repository that any developer can contribute to. During the process of identifying stakeholders, business goals, and architecturally significant requirements, generating a module view and a C&C view, and analyzing the project's technical debt, a few issues became apparent to the team.
The main issue was realizing that the evolution of Homebrew from a personal project to a large-scale open-source repository and community had been done without the aid of an architect. Although there are core maintainers to oversee Homebrew’s direction and focus, overall the system is largely composed of scripts and modules and has little to no structure. This has led to confusing architectural views, as well as a technical debt analysis mainly composed of code quality issues.
For a future project in the Documenting and Understanding Software Systems course at the University of Victoria, it would be wise to choose a system with a clearly defined and intentionally designed architecture. Homebrew is a brilliantly successful project that has provided a great learning experience, but for the sake of drawing parallels between the project and our course materials, a project with corporate support and budget for architects may provide more opportunities to observe software architecture in practice.
[1] MikeMcQuaid, "brew uses should be (much) faster · Issue #3007 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/issues/3007. [Accessed: 01- Feb- 2018].
[2] MikeMcQuaid, "brew linkage should be faster (if possible) · Issue #3008 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/issues/3008. [Accessed: 01- Feb- 2018].
[3] adamv, "tab-completing Formula names is very slow · Issue #20 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/issues/20. [Accessed: 01- Feb- 2018].
[4] MikeMcQuaid, "update.sh: further speed up brew update. by MikeMcQuaid · Pull Request #669 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/pull/669. [Accessed: 01- Feb- 2018].
[5] MikeMcQuaid, "sandbox: sandbox all taps by default. by MikeMcQuaid · Pull Request #2898 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/pull/2898. [Accessed: 01- Feb- 2018].
[6] MikeMcQuaid, "sandbox: sandbox all taps by default. by MikeMcQuaid · Pull Request #2898 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/pull/2898. [Accessed: 01- Feb- 2018].
[7] MikeMcQuaid, "Hide sensitive tokens from install/test/post. by MikeMcQuaid · Pull Request #2524 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/pull/2524. [Accessed: 01- Feb- 2018].
[8] mistydemeo, "Add vendored sha256 by mistydemeo · Pull Request #2684 · Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/pull/2684. [Accessed: 01- Feb- 2018].
[9] "Homebrew/brew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/brew/blob/master/docs/How-to-Create-and-Maintain-a-Tap.md. [Accessed: 01- Feb- 2018].
[10] mistydemeo, "brew reinstall fails when given a URL · Issue #27117 · Homebrew/legacy-homebrew", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/legacy-homebrew/issues/27117. [Accessed: 01- Feb- 2018].
[11] "The Blob", Source Making, 2018. [Online]. Available: https://sourcemaking.com/antipatterns/the-blob. [Accessed: 15- Mar- 2018].
[12] "Identifying and Measuring Technical Debt - IEEE Software Boeing", On Technical Debt, 2018. [Online]. Available: http://www.ontechnicaldebt.com/blog/identifying-and-measuring-technical-debt-ieee-software-boeing/. [Accessed: 15- Mar- 2018].
[13] ilovezfs, "python 3.6.4, python@2 2.7.14 (new formula) by ilovezfs · Pull Request #24604 · Homebrew/homebrew-core", GitHub, 2018. [Online]. Available: https://github.com/Homebrew/homebrew-core/pull/24604. [Accessed: 15- Mar- 2018].