Python Crypto API Misuses in the Wild: Analyzing Threats to Validity
by @cryptosovereignty


Too Long; Didn't Read

Validity concerns in crypto misuse studies include challenges in generalization, limitations in analysis capabilities (e.g., Babelfish's recursive depth), and potential biases in comparative studies due to different application types, domains, and analysis frameworks. These factors can impact the accuracy and relevance of findings in understanding crypto misuse patterns.

Authors:

(1) Anna-Katharina Wickert, Technische Universität Darmstadt, Darmstadt, Germany;

(2) Lars Baumgärtner, Technische Universität Darmstadt, Darmstadt, Germany;

(3) Florian Breitfelder, Technische Universität Darmstadt, Darmstadt, Germany;

(4) Mira Mezini, Technische Universität Darmstadt, Darmstadt, Germany.

Abstract and 1 Introduction

2 Background

3 Design and Implementation of Licma and 3.1 Design

3.2 Implementation

4 Methodology and 4.1 Searching and Downloading Python Apps

4.2 Comparison with Previous Studies

5 Evaluation and 5.1 GitHub Python Projects

5.2 MicroPython

6 Comparison with previous studies

7 Threats to Validity

8 Related Work

9 Conclusion, Acknowledgments, and References

7 THREATS TO VALIDITY

We evaluated top GitHub Python projects, so our results may not generalize to specialized Python applications. For our data set of MicroPython applications, we also concentrated on popular projects. Thus, our insights may not generalize to less popular or closed-source projects. Nevertheless, we believe that our results provide first interesting insights into crypto misuses in Python.


Currently, our analysis is limited by the capabilities of Babelfish, especially the maximum recursion depth of its filter function. Furthermore, Babelfish currently creates an AST only for a single file. Thus, our analysis fails to resolve misuses that span multiple files. We hope that these limitations can be lifted through further development of Babelfish, which should help to reduce the number of false positives among the potential misuses. In addition, our static analysis may have missed some misuses, as Python is a dynamically typed language.
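To make the single-file and dynamic-typing limitations concrete, the following is a minimal sketch of a per-file AST check. It is not Licma or Babelfish: it uses Python's standard `ast` module as a stand-in, and the function name `find_ecb_uses`, the `settings` module, and the `CIPHER_MODE` constant are hypothetical names chosen for illustration. The check flags calls that receive a `MODE_ECB` attribute directly as an argument, which is roughly the kind of pattern a syntactic rule can see within one file.

```python
import ast


def find_ecb_uses(source):
    """Return line numbers of calls that are passed a `MODE_ECB` attribute directly.

    Illustrative only: a purely syntactic, single-file check built on the
    standard `ast` module, not the analysis used in the paper.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Look at positional and keyword arguments of every call.
        args = list(node.args) + [kw.value for kw in node.keywords]
        for arg in args:
            if isinstance(arg, ast.Attribute) and arg.attr == "MODE_ECB":
                hits.append(node.lineno)
    return sorted(hits)


# Case 1: the insecure mode is written at the call site, so the check sees it.
DIRECT = """
from Crypto.Cipher import AES
cipher = AES.new(key, AES.MODE_ECB)
"""

# Case 2: the mode is defined in another file (hypothetical `settings` module).
# Only the name `CIPHER_MODE` is visible in this file's AST.
CROSS_FILE = """
from Crypto.Cipher import AES
from settings import CIPHER_MODE
cipher = AES.new(key, CIPHER_MODE)
"""

# Case 3: the mode is looked up dynamically at run time.
DYNAMIC = """
from Crypto.Cipher import AES
mode = getattr(AES, "MODE_" + "ECB")
cipher = AES.new(key, mode)
"""
```

Under these assumptions, `find_ecb_uses(DIRECT)` reports the call on line 3, while `find_ecb_uses(CROSS_FILE)` and `find_ecb_uses(DYNAMIC)` both return an empty list even though all three snippets may end up using ECB. Resolving the second case needs an analysis that follows imports across files, and the third needs some form of value tracking or dynamic information.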


We compare studies of different application types that were conducted in different years. Thus, the results might change if the studies were conducted on the same kind of applications today. Further, the results may differ due to the effects of different application domains and different analysis frameworks. Moreover, the percentages of applications with at least one misuse per rule that we took from Zhang et al. [13] might be too positive for C, as the number of firmware images with crypto usages is not explicitly reported.


This paper is available on arXiv under the CC BY 4.0 DEED license.