The Dependency Checker is a tool to explore both dynamic and static linkage dependencies of binaries and libraries built with FOSS components. Once dependencies are identified, the GUI can provide an easy-to-interpret visual indication of possible license issues, based on in-house license policies.
The system consists of two pieces, a command-line program "readelf.py" and a GUI frontend that runs in a web browser.
You can view the development source from git, or check it out using standard git commands:
git clone http://git.linuxfoundation.org/dep-checker.git
Bugs can be filed under the Compliance product.
There is also a mail list for discussion of the tool.
The command-line program and the GUI require Python. They also run the OS commands file, ldd, objdump and readelf, so these should be present on your system. The GUI requires Django, along with sqlite support for the results database. A web browser is also needed to interact with the GUI. If your distribution does not provide Django, you can follow these installation instructions.
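Before installing, it can be handy to confirm that the external commands listed above are actually on your PATH. The following is a minimal sketch, not part of dep-checker itself; the function name missing_commands is made up for illustration.

```python
import shutil

# External commands the documentation lists as prerequisites.
REQUIRED_COMMANDS = ("file", "ldd", "objdump", "readelf")


def missing_commands(commands=REQUIRED_COMMANDS):
    """Return the commands that cannot be found on the current PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Missing required commands: " + ", ".join(missing))
    else:
        print("All required commands are available.")
```

An empty list means all four commands were found; anything else names what still needs to be installed.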
The program is packaged as an rpm package, with dependencies on python-django. If your system does not provide django, or it's named differently, you may need to install using --nodeps:
rpm -Uvh dep-checker-0.0.5-1.noarch.rpm --nodeps
Note: If you had to use --nodeps, then you must make sure django is installed and functional on your system. Both the command line program and the gui depend on django.
The installation creates a "compliance" user/group and should create a desktop menu entry to launch the server and open the GUI in your web browser.
In the future we may bundle django with the package to make things simpler, as well as provide .deb packaging.
You can also check out the project from git and run it in place:
git clone http://git.linuxfoundation.org/dep-checker.git
cd dep-checker
Alternatively, you can get the latest tarball from the git web page by clicking on the snapshot link in the upper right-hand part of the page.
Once you have the tarball, unpack it (this is an example; the hash in your file name may differ):
tar -xf dep-checker-3af829ae0cc5aba33192c000ef0365ef6bced843.tar.gz
cd dep-checker
Create the application database and the documentation (you will need w3m to create README.txt):
make
If you don't have root permissions on the machine to install Django, you can install it in-place with the dep-checker install:
tar -xf Django-x.x.x.tar.gz
cp -ar Django-x.x.x/django dep-checker/compliance
cd dep-checker/bin
ln -s ../compliance/django .
Run the server and the gui should show up in a browser window:
./bin/dep-checker.py start
To kill the django server, you can run:
./bin/dep-checker.py stop
The application installs under the /opt/linuxfoundation namespace:
- bin - command line program and wrapper script to launch gui
- compliance - gui application tree and results database
- doc - License file
- share - Desktop menu files and icons
The compliance directory contains:
- compliance - sqlite results database
- __init__.py, manage.py, settings.py, urls.py - generated by django at project creation, settings.py does have some configurable settings. None of the others should be altered.
- linkage - dep-checker GUI code
- media - static html elements such as images, css, javascript files. Documentation is also in this directory.
- templates/linkage - the dep-checker html tree
To run the gui/server (as user compliance for installed package), there is a script that su's to the compliance user, starts the server and attempts to open a browser page to the GUI:
/opt/linuxfoundation/bin/dep-checker.py start
To stop the server run:
/opt/linuxfoundation/bin/dep-checker.py stop
If for some reason this does not work, you can manually perform the steps to start the server:
su - compliance
cd /opt/linuxfoundation/compliance
python manage.py runserver
You can terminate the server from this console by hitting ctrl-C.
The command line program is called readelf.py, and it resides in /opt/linuxfoundation/bin:
Usage: readelf.py [options] <file/dir tree to examine> [recursion depth]
Options:
  -c                   output in csv format
  -d                   write the output into the results database
  -s DIR               directory tree to search
  --comments=COMMENTS  test comments (when writing to database)
  --project=PROJECT    project name (when writing to database)
  --no-static          don't look for static dependencies
  --version            show program's version number and exit
  -h, --help           show this help message and exit
The -c option is primarily used to pass data to the GUI. The format without this argument is more human-readable if you are using the command line directly.
The -s option expects a directory as an argument. If you specify this option, the program searches the given directory tree and analyses only files whose name matches the next argument:
/opt/linuxfoundation/bin/readelf.py -s /foo bar
The program will search everything under /foo for ELF files named bar.
Specifying only a directory will search and report on every ELF file in that directory tree:
/opt/linuxfoundation/bin/readelf.py /foo
Specifying only a file will attempt to test only the specified file:
/opt/linuxfoundation/bin/readelf.py /foo/bar/baz
The recursion level is an optional argument. When given, the program reports not only the direct dependencies, but also the dependencies of each library used by the target file:
/opt/linuxfoundation/bin/readelf.py /foo/bar/baz 4
This would attempt to recurse down 4 levels from the target file, giving output something like this:
[1]/foo/bar/baz: libtermcap.so.2
[2]/lib/libtermcap.so.2.0.8: libc.so.6
[3]/lib/i686/libc-2.10.1.so: ld-linux.so.2
[1]/foo/bar/baz: libdl.so.2
[2]/lib/libdl-2.10.1.so: libc.so.6
[3]/lib/i686/libc-2.10.1.so: ld-linux.so.2
[2]/lib/libdl-2.10.1.so: ld-linux.so.2
[1]/foo/bar/baz: libc.so.6
[2]/lib/i686/libc-2.10.1.so: ld-linux.so.2
You will note that even though we asked for a recursion level of 4, the test stopped at level 3, as the program detects when no further recursion is possible.
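If you want to post-process the human-readable output in a script, the recursive listing can be parsed into records. The sketch below assumes the "[level]path: SONAME" line layout seen in the sample output, with an optional "(static)" suffix; the real output of readelf.py may differ in detail, and the function name is hypothetical.

```python
import re

# Matches lines such as "[2]/lib/libdl-2.10.1.so: libc.so.6", with an
# optional " (static)" suffix. The layout is inferred from the sample output.
LINE_RE = re.compile(r"^\[(\d+)\](.+?):\s+(\S+)(\s+\(static\))?\s*$")


def parse_dependency_lines(text):
    """Turn the listing into (level, file, library, is_static) tuples."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            level, path, library, static = match.groups()
            records.append((int(level), path, library, static is not None))
    return records


SAMPLE = """\
[1]/foo/bar/baz: libtermcap.so.2
[2]/lib/libtermcap.so.2.0.8: libc.so.6
[3]/lib/i686/libc-2.10.1.so: ld-linux.so.2
[1]/foo/bar/baz: libdl.so.2
[2]/lib/libdl-2.10.1.so: libc.so.6
[3]/lib/i686/libc-2.10.1.so: ld-linux.so.2
[2]/lib/libdl-2.10.1.so: ld-linux.so.2
[1]/foo/bar/baz: libc.so.6
[2]/lib/i686/libc-2.10.1.so: ld-linux.so.2
"""

records = parse_dependency_lines(SAMPLE)
# Direct (level 1) dependencies of the target file.
direct = sorted({lib for level, _, lib, _ in records if level == 1})
# Deepest level actually reached, regardless of the depth requested.
deepest = max(level for level, _, _, _ in records)
print(direct, deepest)
```

For scripted use, the csv output from the -c option is probably a better starting point than scraping the human-readable form.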
Static library dependencies will appear with (static) appended to the SONAME:
libncurses.so.5 (static)
The --no-static option disables the attempt to resolve statically linked dependencies.
The -d, --project, and --comments options let the command line program feed results into the database used by the gui. Setting -d forces -c and compiles the collected results into a list that is fed to the compliance database, where it will show up alongside the results of tests executed from the gui. The --project and --comments options are optional, just as the corresponding fields are in the Check Dependencies tab. Multi-word strings should be enclosed in quotes. Here is an example:
/opt/linuxfoundation/bin/readelf.py -d --project=test --comments='this is a test' /usr/bin/foo
All the other options, such as searching, recursion, and disabling static checking are also available in this mode, and the program will still output error conditions and the data to stdout.
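When driving the command line program from another script, passing the arguments as a list avoids shell quoting problems with multi-word comments. This is only a sketch: the helper names build_command and run_check are made up, and the option order simply follows the usage text above.

```python
import subprocess

# Install location given in this document.
READELF_PY = "/opt/linuxfoundation/bin/readelf.py"


def build_command(target, depth=None, search_dir=None, project=None,
                  comments=None, write_db=False, no_static=False):
    """Assemble an argument list using only the documented options."""
    cmd = [READELF_PY]
    if write_db:
        cmd.append("-d")
    if project:
        cmd.append("--project=" + project)
    if comments:
        cmd.append("--comments=" + comments)
    if no_static:
        cmd.append("--no-static")
    if search_dir:
        cmd.extend(["-s", search_dir])
    cmd.append(target)
    if depth is not None:
        cmd.append(str(depth))
    return cmd


def run_check(**kwargs):
    """Run readelf.py and return its stdout (requires an installed dep-checker)."""
    result = subprocess.run(build_command(**kwargs), capture_output=True,
                            text=True, check=True)
    return result.stdout


print(build_command("/usr/bin/foo", write_db=True, project="test",
                    comments="this is a test"))
```

Because no shell is involved, the comment string needs no extra quoting.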
If a browser does not open when you launch the menu item, you can access the GUI (once the server is started) at http://127.0.0.1:8000/linkage.
The GUI is fairly straightforward, with tabs to access the various aspects of the program:
- Check Dependencies - Test entry, initiate form
- Review Results - Tabular list of existing test results
- Licenses - License/alias entry tab
- License Bindings - Define license bindings for targets (test files) and libraries
- License Policies - Define sets of target/dependency license policies, to be flagged during testing
- Settings - View and change other settings, and reload the static database
- Documentation - This documentation
A test sequence would typically start at the Check Dependencies page, where you enter the test criteria. This setup parallels the operation of the command line program, where you select whether to search for a file under a directory, test a whole directory, or just a single file. There is also a drop-down to select the recursion level. You can disable checking for static dependencies via a checkbox.
The user field is pre-populated with the compliance user, but can be overridden. The project and comments fields are optional for your use in tracking tests.
Once you enter the test criteria, click on the Run Dependency Check button. After the test runs you will be presented with the detailed test results in tabular form. Depending on the number of files to be tested and the recursion level, the test can take a few minutes, so be patient.
Until there are licenses and bindings defined, the results detail will show TBD for both the target and dependency licenses. Now that there is data in the system, you can go back and define these relationships and update the test data.
There is a Print Results button on the detail page that should open the browser print dialog to print to a physical printer or to a file. Some parts of the GUI are hidden in the printed output so that only the test results show up in the printed report.
The test results should also be accessible from the Review Results page. This is a tabular list of all the test runs, sorted by test id/date. The far-right column has the information entered from the Check Dependencies tab. Clicking on the link for the target file or directory will open the detail tab. If you want to delete test results, you can select the checkboxes and delete them from here, using the Delete Selected Tests button.
The Licenses tab lets you enter license/version info. You enter the license name (example: GPL) in the left-hand field and the version (example: 3.0) in the right-hand field. Like the Review Results tab, you can select and delete licenses using the checkboxes and the Delete Selected Licenses button. The license-version combination will be concatenated in the report to look like: GPL 3.0.
This tab also contains the entry form to map the license/version info used by the application to any possible string variations provided by imported data from other sources. You can select a license/version from the system and then provide up to 9 alternate names (aliases) that will be considered equivalent when examining test results for policy violations. Additional aliases can be added to an existing list by simply selecting the license, entering just the new alias and clicking "Add" again.
The License Bindings tab lets you define the license binding for the target files, that is, the files that are being tested for dependencies. The same type of bindings can be done for the dependency libraries.
The drop-down under Target will show all files having test data. The drop-down under License will show all the licenses defined in the Licenses tab. If there is no test data or no licenses, the drop-downs will be empty. If there is test data in the system, you can update the license information for current data using the Update Target Test Data button.
The drop-down under Library will show all libraries in the test data. The drop-down under License will show all the licenses defined in the Licenses tab. The License selector does not differentiate between static and dynamic versions; both will be treated the same. If there is no test data or no licenses, the drop-downs will be empty. If there is test data in the system, you can update the license information for current data using the Update Library Test Data button.
The License Policies tab lets you define pairings of target/library licenses that could have potential issues. You select the Target License and Library License from the drop-downs and then select the relationship, either Static, Dynamic, or Both. You can also set the state to either Approve or Disapprove. When a test is run, violations of these policy settings will show up in the report detail printed in red with a red flag after the License name. License pairings that are approved will have normal black text, and unknown/undefined pairings will be highlighted in orange with an orange flag. Like the other tabs, you can select and delete policies using the checkboxes and the Delete Selected Policies button.
In the screenshot below, you can see an example of a flagged policy violation. The application myapp has been compiled against libmylib.so. The licenses L1 2.0 and L2 1.3 have been defined in the policy screen as being an issue. When the test data is displayed, this relationship is flagged as being problematic:
If the target (file) or library is using a license naming convention that is not defined in the application licenses tab, but has a naming convention defined as equivalent in the aliases table, the license violation will look like:
alias name (real name) [graphical flag]
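The flagging rules described above (red for disapproved pairings, black for approved ones, orange for unknown ones, with aliases resolved first) can be sketched as follows. This is an illustration of the described behaviour, not the code in views.py; the function names, the dictionary layout and the alias string "L2v1.3" are assumptions, while the field names tlicense, dlicense, relationship and status mirror the linkage_policy table.

```python
# Colours as described for the report detail.
FLAG_COLOUR = {"disapproved": "red", "approved": "black", "unknown": "orange"}


def canonical_license(name, known_licenses, aliases):
    """Map a license string to the name used in policies, via the aliases."""
    if name in known_licenses:
        return name
    return aliases.get(name)  # None when the string is not known at all


def check_policy(tlicense, dlicense, is_static, policies, known, aliases):
    """Return 'approved', 'disapproved' or 'unknown' for one dependency."""
    target = canonical_license(tlicense, known, aliases)
    library = canonical_license(dlicense, known, aliases)
    if target is None or library is None:
        return "unknown"
    linkage = "Static" if is_static else "Dynamic"
    for policy in policies:
        if (policy["tlicense"] == target
                and policy["dlicense"] == library
                and policy["relationship"] in (linkage, "Both")):
            return "approved" if policy["status"] == "A" else "disapproved"
    return "unknown"


# Example data: the L1 2.0 / L2 1.3 pairing is disapproved for both linkages.
known = {"L1 2.0", "L2 1.3"}
aliases = {"L2v1.3": "L2 1.3"}  # hypothetical alias string
policies = [{"tlicense": "L1 2.0", "dlicense": "L2 1.3",
             "relationship": "Both", "status": "D"}]

verdict = check_policy("L1 2.0", "L2v1.3", False, policies, known, aliases)
print(verdict, FLAG_COLOUR[verdict])
```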
The Settings tab lets you change the static data used to detect static libraries in use by the program being tested.
The symbol data used for static detection is based on the libraries currently installed on the test system. You can reload the data by activating the Reload Static Data button at the top of the page.
By default, system libraries from the normal system paths are loaded into the database for static symbol detection. The list of paths to search is provided in a large edit box, one per line; you can add or remove paths from this list, and activate the Save Changes button to save those search paths.
In the current configuration, the django admin interface is enabled. While you can use this interface to directly access the database records, one should take care not to alter existing records, except in the case of wishing to add license information to records.
admin interface: http://127.0.0.1:8000/admin (username compliance, password compliance)
The database for the application is in the file compliance in the compliance directory. It is an sqlite3 database file. Tables used by the application are as follows (arranged more or less as they are integrated into the application tabs):
- linkage_test - table containing the information entered from the Test tab. Each test has an id used to track the test and the relationships to the file and lib tables, although the id is not shown in the gui, only the date/time.
  - id - test id, used for linking to the file and lib tables (primary key)
  - do_search - boolean value used to determine whether to search a directory tree for a particular filename
  - disable_static - boolean value used to determine whether to test for static dependencies
  - test_done - boolean value used to hide the test from the results page until it has completed
  - recursion - recursion level for dependency checking, default is 1
  - target - target file or directory
  - target_dir - target directory when searching for a file by name
  - test_date - test date/time
  - user - username entered on the test form
  - project - project name entered on the test form
  - comments - free-form comments field from the test form
- linkage_file - target files examined during a test run. Linked to the test table via test_id.
  - id - record id, not exposed in the gui (primary key)
  - test_id - reference to id from the test table
  - file - file name/path. In the recursive case, can be a library path.
  - license - file license
  - level - recursion level. Top-level file tested would be level 1. True paths to the dependent libraries on the system under test would show up here as higher levels when doing recursive testing (these files are not exposed in the GUI).
  - parent_id - When doing recursion, each child file has a parent. This field captures that relationship.
  - checked_static - flag to indicate whether static checking was done on this file
- linkage_lib - library dependencies of the files from the files table.
  - id - record id, not exposed in the gui (primary key)
  - test_id - reference to id from the test table
  - file_id - reference to id from the file table
  - library - library SONAME
  - static - flag to indicate whether the library is static or dynamic
  - license - library license
  - level - recursion level
  - parent_id - When doing recursion, each child library has a parent. This field captures that relationship.
- linkage_license - This table is not linked to any of the test data tables, but is used to populate the target/library licenses and license policy drop-downs, which in turn fills in the license data for a test and flags policy issues.
  - id - record id, not exposed in the gui (primary key)
  - longname - long version of license string (e.g. GNU General Public License)
  - license - abbreviated license string (e.g. GPL)
  - version - license version number (e.g. 3.0)
- linkage_aliases - This table is not linked to any of the test data tables, but is used to map possible license string variations from outside sources to the names defined in the license policies. There is a many-to-one relationship between alias and license, with each alias entry needing to be unique.
  - id - record id, not exposed in the gui (primary key)
  - license - concatenation of license/version in the same form generated from the license table
  - alias - any acceptable variation that is considered equivalent (e.g. GPLv3, GPL3)
- linkage_filelicense - target file/license bindings. These are not linked to any other table, but the information, if present, is used to fill in the license field in the file table after a test run, or you can manually update the data from the Target Licenses tab.
  - id - record id, not exposed in the gui (primary key)
  - file - file name/path, selected from the file table
  - license - selected concatenation of license/version from the license table
- linkage_liblicense - library/license bindings. These are not linked to any other table, but the information, if present, is used to fill in the license field in the lib table after a test run, or you can manually update the data from the Library Licenses tab.
  - id - record id, not exposed in the gui (primary key)
  - library - library SONAME, selected from the lib table
  - license - selected concatenation of license/version from the license table
- linkage_policy
  - id - record id, not exposed in the gui (primary key)
  - tlicense - target license selected from the concatenation of license/version from the license table
  - dlicense - library license selected from the concatenation of license/version from the license table
  - relationship - relationship string, either 'Static', 'Dynamic', or 'Both'
  - rank - problem ranking - currently not used
  - status - character flag for Approved (A) or Disapproved (D)
  - edit_date - date/time the policy was entered
- linkage_staticsymbol
  - id - record id, not exposed in the gui (primary key)
  - symbol - symbol name. Symbols extracted from the target under test are matched against this entry to find possible library sources
  - libraryname - Library SONAME that provides this symbol
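As an illustration of how the file and lib tables relate, the sketch below lists the library dependencies recorded for one test run. It runs against a throwaway in-memory database: the table and column names come from the list above, but the column types and the sample rows are assumptions made for the example.

```python
import sqlite3


def dependencies_for_test(conn, test_id):
    """Return (file, library, static, library license) rows for one test."""
    query = """
        SELECT f.file, l.library, l.static, l.license
        FROM linkage_lib AS l
        JOIN linkage_file AS f ON f.id = l.file_id
        WHERE l.test_id = ?
        ORDER BY f.file, l.library
    """
    return conn.execute(query, (test_id,)).fetchall()


# Throwaway database; column types and rows are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE linkage_file (
        id INTEGER PRIMARY KEY, test_id INTEGER, file TEXT, license TEXT,
        level INTEGER, parent_id INTEGER, checked_static INTEGER);
    CREATE TABLE linkage_lib (
        id INTEGER PRIMARY KEY, test_id INTEGER, file_id INTEGER,
        library TEXT, static INTEGER, license TEXT, level INTEGER,
        parent_id INTEGER);
    INSERT INTO linkage_file VALUES (1, 1, '/foo/bar/baz', 'TBD', 1, 0, 1);
    INSERT INTO linkage_lib VALUES (1, 1, 1, 'libc.so.6', 0, 'TBD', 1, 0);
    INSERT INTO linkage_lib VALUES (2, 1, 1, 'libncurses.so.5', 1, 'TBD', 1, 0);
""")

rows = dependencies_for_test(conn, 1)
for row in rows:
    print(row)
```

To try it on real data, replace ":memory:" with the path to the compliance database file and drop the executescript call.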
Because linkage_license, linkage_aliases, linkage_filelicense, linkage_liblicense, and linkage_policy are more or less independent of the test data, one could easily load these tables from other data sources, using sqlite3, as long as the existing table schemas are honored. To illustrate, let's walk through an example of importing a file of library/license mappings from another source.
Say you have a csv file of library,license data like this:
libfoo.so.6,LGPLv3
libbar.so.2,BSD1
libbaz.so.4,APACHE 2
We can easily process this into SQL statements that can be loaded into dep-checker, using whatever scripting language you're comfortable with. With the shell and awk, perhaps something like this:
awk -F, '{print "INSERT INTO linkage_liblicense (library, license) VALUES (\"" $1 "\",\"" $2 "\");"}' < liblicenses.csv > liblicenses.sql
cd dep-checker/compliance
sqlite3 compliance < liblicenses.sql
And if we look at the database now:
sqlite3 compliance
SQLite version 3.6.23.1
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from linkage_liblicense;
1|libfoo.so.6|LGPLv3
2|libbar.so.2|BSD1
3|libbaz.so.4|APACHE 2
So our data is loaded, but we have a slight problem in that the license naming conventions from our data file don't match the format used in dep-checker to feed into our license policies. In dep-checker, "LGPLv3" would be "LGPL 3.0", "BSD1" would be "BSD 1.0" and "APACHE 2" might be "Apache 2.0". We can correct this either by defining alias mappings in the Licenses tab or with some additional SQL (assuming we know the "correct" naming defined in dep-checker):
INSERT INTO linkage_aliases (license, alias) VALUES ('LGPL 3.0', 'LGPLv3');
INSERT INTO linkage_aliases (license, alias) VALUES ('BSD 1.0', 'BSD1');
INSERT INTO linkage_aliases (license, alias) VALUES ('Apache 2.0', 'APACHE 2');
If you have a large number of alias mappings to perform, SQL may be the way to go; otherwise they can be assigned under Aliases in the Licenses tab, where you'll be assured of the correct mappings, as only "known" licenses will be available in the drop-down.
A similar process could be used to load license associations for target files.
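The same import can also be done from Python, where parameterized statements take care of quoting. This is a sketch only: load_library_licenses is a made-up helper, and the demo uses an in-memory database whose column types are assumed (only the column names come from the table description).

```python
import csv
import io
import sqlite3


def load_library_licenses(conn, csv_file):
    """Insert library,license rows from an open CSV file; return the count."""
    rows = [
        (row[0].strip(), row[1].strip())
        for row in csv.reader(csv_file)
        if len(row) == 2  # skip blank or malformed lines
    ]
    conn.executemany(
        "INSERT INTO linkage_liblicense (library, license) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Demonstration against an in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE linkage_liblicense "
    "(id INTEGER PRIMARY KEY, library TEXT, license TEXT)"
)
sample = io.StringIO(
    "libfoo.so.6,LGPLv3\nlibbar.so.2,BSD1\nlibbaz.so.4,APACHE 2\n"
)
loaded = load_library_licenses(conn, sample)
print(loaded, conn.execute("SELECT * FROM linkage_liblicense").fetchall())
```

For a real import you would open the csv file with open("liblicenses.csv", newline="") and connect to the compliance database file instead of ":memory:".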
Once the license data is loaded, the functions update_file_bindings() and update_lib_bindings() in views.py would apply this information to existing test data, which can also be done from the gui Target Licenses and Library Licenses tabs by clicking the Update Test Data button.
As mentioned earlier, the command line program, readelf.py, does all the actual file search and analysis. For discovering dependencies, it uses readelf, ldd, and file, using the following methodology:
- Search for file?
  - Yes - start search at top of directory tree specified, for the file name in question
  - No - start processing specified file or walk the whole specified directory tree
- File found - elf file? (using "file")
  - No - return "not an elf file", continue processing more files or exit
  - Yes - check whether static or dynamically linked
    - Dynamic - run "readelf -d" on the file and capture all the SONAMES tagged as NEEDED, followup with static analysis
      - Recursive analysis?
        - Yes - run "ldd" on the test file to get the paths to the libraries associated with the SONAME, repeat the analysis on each of these libraries, and in turn their dependencies, stopping when we either reach the desired recursion level or glibc/ld-linux and no further recursion is possible
        - No - output the results and exit or process the next file
    - Static
      - run "readelf -s" on the file and capture all the functions from the symbol list
      - run "readelf -wi" on the file and capture all the functions from the debug info
      - isolate the functions that have no debug info, these are considered as coming from another library
      - look up possible sources of the function from the database and report
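The dynamic step in the methodology above, running "readelf -d" and collecting the entries tagged NEEDED, can be sketched as follows. This is not the code from readelf.py; the function names are made up, and the sample text only imitates the usual layout of a dynamic section dump.

```python
import re
import subprocess

# A NEEDED entry looks like:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(.+?)\]")


def parse_needed(readelf_output):
    """Extract the SONAMEs tagged NEEDED from 'readelf -d' output."""
    return NEEDED_RE.findall(readelf_output)


def needed_libraries(path):
    """Run 'readelf -d' on a file and return its direct dependencies."""
    result = subprocess.run(["readelf", "-d", path], capture_output=True,
                            text=True, check=True)
    return parse_needed(result.stdout)


SAMPLE = """
Dynamic section at offset 0x2d58 contains 27 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x1000
"""

print(parse_needed(SAMPLE))
```

Keeping the parsing separate from the subprocess call makes the parser easy to check without an ELF file at hand.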
There are certain limitations in the analysis of binaries/libraries for static/dynamic dependencies.
In the static case, the symbol table is created either on the build server (packaged version), or the user's machine (run in-place from git). The content of the table will be largely driven by the libraries present on the system, and may not completely reflect the system where the target files have been built.
Also, the same symbol can come from more than one library, and the tool can only provide the possible sources of the symbol in question. Some deeper investigation of the actual build system may be required to identify the actual static linkage.
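The ambiguity can be illustrated with a small sketch of the static lookup: functions present in the symbol table but absent from the debug info are looked up in an index of symbol to library names, and every candidate is reported. The function names and the sample data are hypothetical; in the tool, the index would correspond to the linkage_staticsymbol table.

```python
def functions_without_debug_info(symbol_functions, debug_functions):
    """Functions in the symbol table that have no debug info entry."""
    return sorted(set(symbol_functions) - set(debug_functions))


def candidate_libraries(symbols, static_symbol_index):
    """Map each symbol to every library that could have provided it."""
    return {
        symbol: sorted(static_symbol_index.get(symbol, ()))
        for symbol in symbols
    }


# Illustrative data only: one symbol has two possible sources.
symbol_functions = ["main", "initscr", "deflate"]
debug_functions = ["main"]
static_symbol_index = {
    "initscr": {"libncurses.so.5", "libncursesw.so.5"},
    "deflate": {"libz.so.1"},
}

foreign = functions_without_debug_info(symbol_functions, debug_functions)
candidates = candidate_libraries(foreign, static_symbol_index)
print(candidates)
```

Here initscr maps to two libraries, so the report can only list both as possible sources.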
For the dynamic case, the level one dependencies are pretty clear-cut, but for the recursive case one can get a slightly different set of dependencies when drilling down into the system libraries, depending on how the system libraries themselves are built.
0.0.4:
- Change name to Dependency Checker Tool
- Drop Home, About tabs
- Drop copyright notice
- Add AUTHORS, Changelog
- Add bzr, bugzilla, mail list info to README.html
- Fix date/time display in detail, results
- Fix directory recursion
- New License file
- Add contribution requirements
- Rework dep recursion in command line app
- Fix test.html so it displays ok on most browsers
- Workaround weird display issue for results,detail in firefox 3.6.3
- Fix detail page printing, suppress unwanted elements with css
- Add new logo
- Detect/use local timezone settings
- Add directory/file browsing to test input form
0.0.5:
- Add license/license policy tabs
- Rework documentation
- Add dep-server-stop.sh to stop running server instances
- Add target/library license binding tabs and tables
- More GUI tweaks for more tabs
- Insert file/library bindings into test data after run
- Add policy violation flagging to test results
- Expose --no-static in the GUI
- Add license<-->alias mapping/reporting capability (bug 459)
- Add approved/disapproved policy highlighting (bug 475)
- Pre-load with license/alias data (bug 478)
- Enable data collection from the command line (bug 472)
Any contribution submitted for inclusion in the Dependency Checker Tool must be signed by its author following the Developer's Certificate of Origin 1.1.
By making a contribution to this project, I certify that:
a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or
b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or
c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.
d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.
Patches to the mailing list need to be signed as:
Signed-off-by: <author name> <author email address>
Same thing applies for reviewers:
Signed-off-by: <reviewer name> <reviewer email address>
and committer:
Signed-off-by: <committer name> <committer email address>
Copyright (c) 2010 Linux Foundation
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.