In this article, I will walk you through:

1. The signs that refactoring is needed.
2. What is considered refactoring?
3. What to do before refactoring?
4. A step-by-step guide to restructuring a Python package.

The examples are written specifically with Python in mind. However, the general principles should hold true across languages.

The signs of need for refactoring

- Your versions are no longer supported. Even the best-written code becomes obsolete after some time.
- You want the new features in newer versions of frameworks, libraries, or even the programming language.
- Your code was written quick and dirty, with tons of technical debt.
- Your code is hard to navigate, with countless indirect accesses.
- Your code does not comply with standards, style guides, or conventions.

What is considered refactoring?

The term "refactor" covers a broad array of actions and definitions. While they generally share a common goal, a cleaner and better code base, many different actions can be considered "refactoring":

- Upgrading your dependencies to newer versions
- Renaming your functions, classes, modules, etc.
- Reorganizing your code base, such as moving a function from one file to another
- Improving the implementation of a function to increase performance
- Reformatting your code to make it standards compliant

Before Refactoring

Although programming languages differ, there are some general steps and principles to follow while refactoring.

Step 1: Get A Glimpse of Your Code Quality

Run static analysis tools. Static analysis tools give you a summary report with quantitative statistics that you can compare from time to time while refactoring.

In Python, I personally use Prospector, a static analysis suite that encompasses other tools such as pep8, Pylint, etc. The best thing I like about Prospector is its ability to detect and adapt to the frameworks I use (Django, Celery).

In my case, Prospector found 3000+ messages. The number of messages should be vastly reduced after refactoring.
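
To make that before-and-after comparison concrete, here is a minimal sketch that tallies messages per tool for two analysis runs. The message shape (a list of dicts with a `source` key) and the sample data are assumptions for illustration only; adapt the loading step to whatever report format your analysis tool actually emits.

```python
from collections import Counter


def count_by_source(messages):
    """Count static-analysis messages per originating tool.

    `messages` is assumed to be a list of dicts with a "source" key;
    this shape is hypothetical, not a documented report format.
    """
    return Counter(message["source"] for message in messages)


def compare_runs(before, after):
    """Return {source: (count_before, count_after)} for two runs."""
    before_counts = count_by_source(before)
    after_counts = count_by_source(after)
    sources = sorted(set(before_counts) | set(after_counts))
    # Counter returns 0 for a source that is missing from one of the runs.
    return {s: (before_counts[s], after_counts[s]) for s in sources}


# Hypothetical sample data standing in for two real reports.
before_run = [
    {"source": "pylint", "message": "unused-import"},
    {"source": "pylint", "message": "invalid-name"},
    {"source": "pep8", "message": "line too long"},
]
after_run = [
    {"source": "pylint", "message": "invalid-name"},
]

for source, (old, new) in compare_runs(before_run, after_run).items():
    print(f"{source}: {old} -> {new}")
```
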
Do note that not all messages found are valid; there is a chance of false positives.

Step 2: Prepare and Verify Test Cases

Code without tests is bad code by design. Do you have test cases in place? If so, how complete are they? Are those test cases up to date?

Why do we need test cases before refactoring? You can argue that you won't need any if you are only making a simple change such as adding validation. Well, trust me, you are going to get caught by unexpected behavior from the masterpiece you touched up.

I'm not going to convince you of the importance of test code in general. The point of having test code before refactoring is to ensure that the behavior of your system stays consistent after refactoring.

Even if there are test cases in place that give you a green light, you should still verify the test code. Let me tell you why. Imagine you run whatever tests are available before you attempt to refactor, and the results say that all tests passed. Nope, they only look like they all passed. When you dig further, you find this test case:

    """
    Package: utils.tests
    Faulty Test Cases
    """
    import unittest

    # Imported at module level so that every test method can use it
    from utils.common.helpers import is_empty


    class UtilsTestCase(unittest.TestCase):
        def setUp(self):
            pass

        def test_is_empty(self):
            # Problem: there is no assertion.
            is_empty('')
            is_empty(None)
            is_empty('', object_type='json')
            is_empty('{}', object_type='json')

        def test_is_ok(self):
            # Problem: this is empty.
            pass

        def test_is_number(self):
            # Problem: this will pass, because printing never fails a test.
            if not is_empty(''):
                print('Fail')

So, you see the point of verifying them? Besides, test cases are often the best documentation for a piece of software. While you navigate the code, they are also your best GPS.

Start Refactoring: Reorganizing / Restructuring

In this section, I will walk you through an example: reorganizing the structure, merging duplicated methods, decomposing, and writing test code to ensure backward compatibility.
config.py looks like this:

    """
    Package: utils.config
    before restructuring
    """
    CONFIG_NAME = {
        "ENABLE_LOGGING": "enable_logging",
        "LOGGING_LEVEL": "logging_level",
    }


    def get_logging_level():
        pass


    class ConfigHelper:
        def get(self, config_name, default=None):
            pass

        def set(self, config_name, value):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            pass

        def is_logging_enabled(self):
            pass


    class LOGGING_LEVEL:
        VERBOSE = "verbose"
        STANDARD = "standard"

Step 1: Write Backward Compatible Code

This step is crucial. Before refactoring our code, test cases MUST be in place. In this case, we write backward compatible code to ensure that all references to the classes/functions/constants still work. In __init__.py, we redefine the class/method signatures:

    """
    in __init__.py
    This is where backward compatibility code lives.
    This is to ensure the refactored package supports
    the old way of importing.
    This is incomplete, we will revisit __init__.py later
    """
    CONFIG_NAME = {}


    def get_logging_level(*args, **kwargs):
        pass


    class ConfigHelper:
        def get(self, *args, **kwargs):
            pass

        def set(self, *args, **kwargs):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            pass

        def is_logging_enabled(self):
            pass


    class LOGGING_LEVEL:
        pass

The __init__.py is incomplete for now. We will revisit the file later. Next, we write a test case to make sure we can still import from the package the same way we imported from the old module.

    """
    in tests.py
    Simple backward compatibility test case
    """
    import unittest


    class ConfigHelperCompatibilityTestCase(unittest.TestCase):
        def test_backward_compatibility(self):
            try:
                from .config import CONFIG_NAME, LOGGING_LEVEL
                from .config import get_logging_level
                from .config import ConfigHelper
            except ImportError as e:
                # str(e) works on Python 2 and 3; e.message is Python 2 only
                self.fail(str(e))

This is a simple test case. You may notice that some backward compatibility issues are not caught by it: it only checks that the names can still be imported, not how they behave.

Step 2: Reorganizing Package Structure

This section gives you an idea of how you can reorganize your Python package.
Let's revisit the config.py we have:

    """
    Package: utils.config
    before restructuring
    """
    CONFIG_NAME = {
        "ENABLE_LOGGING": "enable_logging",
        "LOGGING_LEVEL": "logging_level",
    }


    def get_logging_level():
        pass


    class ConfigHelper:
        def get(self, config_name, default=None):
            pass

        def set(self, config_name, value):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            pass

        def is_logging_enabled(self):
            pass


    class LOGGING_LEVEL:
        VERBOSE = "verbose"
        STANDARD = "standard"

Can you spot what's wrong here? It is messy: there are constants, helpers, and duplicated code in a single file. As the code in config.py grows larger, it will become increasingly difficult to navigate. With this messy structure, you are creating a breeding ground for circular dependencies and hidden coupling, and refining the recipe for the tastiest spaghetti code.

How can you reorganize config.py? To me, separation of concerns comes to mind. The following structure is often considered a good practice for structuring a Python package (this structure is used in Django as well).

    config/
    ├── abstracts.py      # All the abstract classes should live here
    ├── constants.py      # All the constants should live here
    ├── exceptions.py     # All custom exceptions should live here
    ├── helpers.py        # All helpers should live here
    ├── __init__.py       # All backward compatible code lives here
    ├── mixins.py         # All mixins go here
    ├── serializers.py    # All common serializers go here
    └── tests.py          # All `config` related tests should live here

Let's revisit our config.py before refactoring and identify where each piece of code should reside.

    """
    Package: utils.config
    before restructuring
    """
    # This looks like it belongs to utils.config.constants
    CONFIG_NAME = {
        "ENABLE_LOGGING": "enable_logging",
        "LOGGING_LEVEL": "logging_level",
    }


    # This looks like a helper function, goes to utils.config.helpers
    def get_logging_level():
        # This looks like a duplicate method
        pass


    # This looks like a helper class, goes to utils.config.helpers
    class ConfigHelper:
        def get(self, config_name, default=None):
            pass

        def set(self, config_name, value):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            # This looks like a duplicate method
            pass

        def is_logging_enabled(self):
            pass


    # This looks like another constant, goes to utils.config.constants
    class LOGGING_LEVEL:
        VERBOSE = "verbose"
        STANDARD = "standard"

After refactoring, config.py should become a Python package, config, with an __init__.py in it.

    utils/
    ├── config.py         # To be removed
    └── config/
        ├── constants.py
        ├── helpers.py
        ├── __init__.py
        └── tests.py

In utils.config.constants:

    """
    Package: utils.config.constants
    after restructuring
    """
    # Inconsistent programming construct
    CONFIG_NAME = {
        "ENABLE_LOGGING": "enable_logging",
        "LOGGING_LEVEL": "logging_level",
    }


    # Inconsistent programming construct
    class LOGGING_LEVEL:
        VERBOSE = "verbose"
        STANDARD = "standard"

In utils.config.helpers:

    """
    Package: utils.config.helpers
    after restructuring
    """
    def get_logging_level():
        # This is a duplicate, removing this
        pass


    class ConfigHelper:
        def get(self, config_name, default=None):
            pass

        def set(self, config_name, value):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            pass

        def is_logging_enabled(self):
            pass

Step 3: Eliminate and Merge Duplicates

In utils.config.helpers, there are two similar functions: get_logging_level() and ConfigHelper().get_logging_level(). Assuming both implementations are identical, we have to find the best place to host the function. In this case, I remove the standalone get_logging_level() and keep the one in ConfigHelper.

    """
    Package: utils.config.helpers
    after removing duplicates
    """
    class ConfigHelper:
        def get(self, config_name, default=None):
            pass

        def set(self, config_name, value):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            pass

        def is_logging_enabled(self):
            pass

Step 4: Decomposing

Personally, I'm a fan of decomposition. Instead of having a single class ConfigHelper, we can further decompose ConfigHelper into a hierarchy of classes and mixins.

[Figure: Hierarchy of decomposed ConfigHelper]

We host our AbstractBaseConfigHelper in abstracts.py:

    """in abstracts.py"""
    from abc import ABCMeta


    # Python 3 syntax; a `__metaclass__` attribute is ignored on Python 3
    class AbstractBaseConfigHelper(metaclass=ABCMeta):
        def get(self, config_name, default=None):
            pass

        def set(self, config_name, value):
            pass

        def _get_settings_helper(self):
            pass

In mixins.py:

    """in mixins.py"""


    class LoggingConfigMixin:
        def is_logging_enabled(self):
            pass

        def get_logging_level(self):
            pass

In helpers.py:

    """in helpers.py"""
    from .abstracts import AbstractBaseConfigHelper
    from .mixins import LoggingConfigMixin


    class ConfigHelper(AbstractBaseConfigHelper, LoggingConfigMixin):
        pass

ConfigHelper is now decomposed into multiple classes and mixins.

Step 5: Complete Our Backward Compatibility Code

In Step 1, we added some code in __init__.py. However, it is largely incomplete. Let's revisit the file:

    """
    in __init__.py
    This is where backward compatibility code lives.
    This is to ensure the refactored package supports
    the old way of importing.
    This is incomplete, we will revisit __init__.py later
    """
    CONFIG_NAME = {}


    def get_logging_level(*args, **kwargs):
        pass


    class ConfigHelper:
        def get(self, *args, **kwargs):
            pass

        def set(self, *args, **kwargs):
            pass

        def _get_settings_helper(self):
            pass

        def get_logging_level(self):
            pass

        def is_logging_enabled(self):
            pass


    class LOGGING_LEVEL:
        pass

Notice that the bridge between the code above and our newly organized package is still missing.
To establish the bridge, we edit the __init__.py of our config package into:

    """
    in __init__.py
    This is where backward compatibility code lives.
    This is to ensure the refactored package supports
    the old way of importing.
    """
    from .constants import CONFIG_NAME, LOGGING_LEVEL
    from .helpers import ConfigHelper as _ConfigHelper


    def get_logging_level(*args, **kwargs):
        return _ConfigHelper().get_logging_level()


    class ConfigHelper(_ConfigHelper):
        pass

Step 6: Notify The Developer

Up to Step 5, our config is properly refactored. However, we need to keep the developers notified about the change. Is there a straightforward way? Yes. We can emit a warning message whenever a developer calls an obsolete function or instantiates an obsolete class. For example, we annotate the old functions/classes/methods with decorators:

    """decorators.py"""
    import functools
    import warnings


    def refactored_class(message):
        def cls_wrapper(cls):
            class Wrapped(cls):
                def __init__(self, *args, **kwargs):
                    # Warn every time the deprecated class is instantiated
                    warnings.warn(message, FutureWarning)
                    super(Wrapped, self).__init__(*args, **kwargs)
            return Wrapped
        return cls_wrapper


    def refactored(message):
        def decorator(func):
            @functools.wraps(func)
            def emit_warning(*args, **kwargs):
                # Warn every time the deprecated function is called
                warnings.warn(message, FutureWarning)
                return func(*args, **kwargs)
            return emit_warning
        return decorator

In our __init__.py, we add the decorators like this:

    """
    in __init__.py
    This is where backward compatibility code lives.
    This is to ensure the refactored package supports
    the old way of importing.
    """
    from .constants import CONFIG_NAME, LOGGING_LEVEL
    # Assumes decorators.py sits inside the config package
    from .decorators import refactored, refactored_class
    from .helpers import ConfigHelper as _ConfigHelper


    @refactored('get_logging_level() is refactored and deprecated.')
    def get_logging_level(*args, **kwargs):
        return _ConfigHelper().get_logging_level()


    @refactored_class('config.ConfigHelper is refactored and deprecated. '
                      'Please use config.helpers.ConfigHelper')
    class ConfigHelper(_ConfigHelper):
        pass

After Restructuring

After restructuring our Python package, we run our test cases and make sure they all pass.
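
One thing such a test run could also check is that a deprecated entry point still works and actually emits its warning. The following is a minimal, self-contained sketch under a few assumptions: it repeats a `refactored` decorator like the one above so that it runs on its own, and `old_get_logging_level` with its return value is a hypothetical stand-in for the real function, so nothing here depends on the actual `utils.config` package.

```python
import functools
import unittest
import warnings


def refactored(message):
    """Same idea as the decorator in decorators.py, repeated so the sketch runs alone."""
    def decorator(func):
        @functools.wraps(func)
        def emit_warning(*args, **kwargs):
            warnings.warn(message, FutureWarning)
            return func(*args, **kwargs)
        return emit_warning
    return decorator


@refactored('get_logging_level() is refactored and deprecated.')
def old_get_logging_level():
    # Hypothetical stand-in for the deprecated module-level function.
    return "verbose"


class DeprecationWarningTestCase(unittest.TestCase):
    def test_old_function_warns_and_still_works(self):
        # assertWarns fails the test if no FutureWarning is raised in the block.
        with self.assertWarns(FutureWarning):
            result = old_get_logging_level()

        # The old behavior must be preserved by the compatibility wrapper.
        self.assertEqual(result, "verbose")
```

Saved as a test module, this can be run with `python -m unittest`. A real version would import the deprecated names from the refactored package instead of defining stand-ins.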

Conclusion

Up to this point, you should be able to assess the quality of your code base, understand the concept of refactoring, identify the need for refactoring, and understand how to restructure/reorganize a Python package.