# When You SHOULD Duplicate Code

Recently I published the article [Prevent Duplicated TypeScript Code](https://javascript.plainenglish.io/dry-your-wet-typescript-code-e3c777b3daf9). A good friend of mine approached me and said:

> “Great article, I learned a lot!”

I was a little astonished because he is a far better developer than I am. I asked:

> “What exactly did you learn?”

He replied:

> “I never thought about the difference between code duplication and knowledge duplication. It was great that you focused on differentiating this.”

He was talking about **accidental duplication**. Removing it will make your code harder to read and harder to change in the future.

> **Accidental duplication is code that looks similar but represents different logic.**

My friend had noticed the same thing I did while writing the post. That made me write this article, which uses examples to explain how to tell **essential duplication** from **accidental duplication**.

### What is the DRY principle about?

If you want a detailed introduction to the **“Don’t repeat yourself” (DRY) principle**, I suggest reading my [recent article](https://javascript.plainenglish.io/dry-your-wet-typescript-code-e3c777b3daf9) first. This article gives only a short introduction to the principle and focuses on the pitfalls of applying it.

---

The DRY principle states that duplication in code should be avoided when it exists in two places and **repeats the same knowledge and business logic**. The principle is credited to Andy Hunt and Dave Thomas, who state it in their book [The Pragmatic Programmer](https://www.amazon.com/gp/product/0135957052/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0135957052&linkCode=as2&tag=webhighlights-20&linkId=0bdbf88ce0a118807c4b3cc3edfb863f):

> *“Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.” —* Andy Hunt and Dave Thomas

Notice that Andy Hunt and Dave Thomas are pointing out “every **piece of knowledge**” and **not** “every **piece of code**”. Understanding the difference is essential to identifying accidental duplication.

### Knowledge Duplication

> “Are we looking at syntax duplication or knowledge duplication?” — Anthony Sciamanna

To spot knowledge duplication, two questions have to be answered with yes:

1. *Does code exist that looks identical?*
    
2. *Does the code repeat the same knowledge and logic?*
    

---

#### #1 Example: Knowledge Duplication

Let’s have a look at this example:

#1 Example: Knowledge Duplication
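
The original embed is not reproduced here, so the following is only a minimal sketch of what such duplicated code could look like. The `Credentials` type, the `client` and `server` objects, the email check, and the minimum password length of 6 are assumptions made for illustration.

```typescript
// Hypothetical sketch: the type, the rules, and the return values are assumptions.
interface Credentials {
  email: string;
  password: string;
}

// Client side: validate before sending the request.
const client = {
  signUp(credentials: Credentials): string {
    if (!/^\S+@\S+$/.test(credentials.email)) {
      throw new Error("Invalid email");
    }
    if (credentials.password.length < 6) {
      throw new Error("Password too short");
    }
    return `POST /sign-up for ${credentials.email}`;
  },
};

// Server side: the very same validation rules, written a second time.
const server = {
  signUp(credentials: Credentials): string {
    if (!/^\S+@\S+$/.test(credentials.email)) {
      throw new Error("Invalid email");
    }
    if (credentials.password.length < 6) {
      throw new Error("Password too short");
    }
    return `Created user ${credentials.email}`;
  },
};
```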

We have two `signUp` functions. One in the client and the other in our server. Let’s answer our questions:

#### 1\. Does code exist that looks identical? ✅

Alarm bells should ring here for every developer. It is obvious that both functions share duplicated code.

#### 2\. Does the code repeat the same knowledge and logic? ✅

This question is the harder one. If you are not sure, you can ask a replacement question: *Does changing one code block lead to changing the other one?*

> “There is true duplication, in which every change to one instance necessitates the same change to every duplicate of that instance.” — Robert C. Martin

I deliberately chose an easy example here. Obviously, the validation of credentials is the same for the `signUp` function in the client as for the one on the server. If, for example, we decided to increase the required password length to 7, we would need to change it in two places. If we forgot to change it in one place, it would lead to a bug.

---

Since both questions are answered with yes, we can confirm that the code contains knowledge duplication. Let’s refactor by extracting the shared logic into helper functions. Since the client and the server live in different environments, we have to [create a shared library](https://javascript.plainenglish.io/share-code-between-react-client-and-express-server-5dc0977faa76).

**#1 Example: Knowledge Duplication**
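
As a rough sketch of such a refactoring, the validation could be moved into small helper functions that both sides call. The helper names (`isValidEmail`, `isValidPassword`, `validateCredentials`), the constant `MIN_PASSWORD_LENGTH`, and the validation rules are assumptions for illustration. In a real project the helpers would live in the shared library and be imported; they are inlined here to keep the sketch runnable.

```typescript
// Hypothetical shared library: names and rules are assumptions for illustration.
interface Credentials {
  email: string;
  password: string;
}

// Single source of truth for the password rule.
const MIN_PASSWORD_LENGTH = 6;

function isValidEmail(email: string): boolean {
  return /^\S+@\S+$/.test(email);
}

function isValidPassword(password: string): boolean {
  return password.length >= MIN_PASSWORD_LENGTH;
}

function validateCredentials(credentials: Credentials): void {
  if (!isValidEmail(credentials.email)) {
    throw new Error("Invalid email");
  }
  if (!isValidPassword(credentials.password)) {
    throw new Error("Password too short");
  }
}

// Client and server now share the same validation knowledge.
const client = {
  signUp(credentials: Credentials): string {
    validateCredentials(credentials);
    return `POST /sign-up for ${credentials.email}`;
  },
};

const server = {
  signUp(credentials: Credentials): string {
    validateCredentials(credentials);
    return `Created user ${credentials.email}`;
  },
};
```

With this shape, raising the required password length means changing `MIN_PASSWORD_LENGTH` once.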

Notice that each function has only one responsibility to comply with the SOLID principles, in particular the **Single Responsibility Principle**.

> “The Single Responsibility Principle states that a given method/class/component should have a single reason to change” — Robert C. Martin in [Clean Code](https://www.amazon.com/gp/product/0132350882/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0132350882&linkCode=as2&tag=webhighlights-20&linkId=384bd36dd0f905173f14773389477972)

Looks much better and cleaner, right? Now we could change our credential validation in one single source of truth.

### Accidental Duplication

> Duplication is far cheaper than the wrong abstraction

Recognizing the difference between knowledge duplication and accidental duplication is hard even for experienced developers because it requires a domain understanding of the code. That is also why **static analysis tools** are great at detecting duplicated syntax, but they cannot tell (at least not yet) whether it is also a **duplication of knowledge**.

---

#### #2 Example: Accidental Duplication

Let’s start with another example by looking at this code:

#2 Example: Accidental Duplication
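
The original embed is not reproduced here. The sketch below shows one way the described situation could look: two services that extend an abstract `CRUDService` and repeat the same permission check. The `User`, `Product`, and `Feedback` types, the `"create"` permission name, and the in-memory storage are assumptions for illustration.

```typescript
// Hypothetical sketch: types, permission names, and storage are assumptions.
interface User {
  id: string;
  permissions: string[];
}

interface Product {
  title: string;
}

interface Feedback {
  text: string;
}

abstract class CRUDService<T> {
  protected items: T[] = [];

  abstract create(user: User, item: T): T;

  findAll(): T[] {
    return this.items;
  }
}

class ProductService extends CRUDService<Product> {
  create(user: User, product: Product): Product {
    // Permission check, identical in syntax to the one in FeedbackService.
    if (user.permissions.indexOf("create") === -1) {
      throw new Error("Not allowed");
    }
    this.items.push(product);
    return product;
  }
}

class FeedbackService extends CRUDService<Feedback> {
  create(user: User, feedback: Feedback): Feedback {
    // Permission check, identical in syntax to the one in ProductService.
    if (user.permissions.indexOf("create") === -1) {
      throw new Error("Not allowed");
    }
    this.items.push(feedback);
    return feedback;
  }
}
```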

And again, let’s check our questions:

#### 1\. Does code exist that looks identical? ✅

As in the first example, this code contains some duplicated syntax. Both the `ProductService` and the `FeedbackService` duplicate this code block:

#2 Example: Accidental Duplication

We could easily move this block of code to the abstract superclass `CRUDService`. That would save us a few lines of duplicated code.

#### 2\. Does the code repeat the same knowledge and logic? ❌

Let’s start by asking the replacement question: *Does changing one code block lead to changing the other one?*

Suppose our business logic changes:

> *Feedback can now be created by every user while creating products still requires having the right permissions.*

This would lead us to change the `FeedbackService` class but the `ProductService` class does not need to be changed. This means that we identified **accidental duplication** and we should not abstract our code.

> “If two apparently duplicated sections of code evolve along different paths — if they change at different rates, and for different reasons — then they are not true duplicates” — Robert C. Martin

Imagine we had abstracted the duplicated code into our superclass, and not just our two example classes inherited from it but many more. We would have ended up undoing our refactoring because we cleaned up accidental duplication. That is much worse than having some duplicated code.

### Minimizing Accidental Duplication

Often you just can’t entirely remove accidental duplication, but you can minimize it by complying with the SOLID principles and using good naming.

#### #3 Example: Minimize Accidental Duplication

Let’s illustrate how to minimize accidental duplication by looking at this example:

#3 Example: Minimizing Accidental Duplication
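
The original embed is not reproduced here, so this is a minimal sketch of the starting point. The exact transformation (trim, lowercase, whitespace to underscores) and the shape of `saveFile` are assumptions; the article describes `saveFile` as a method, and it is written as a plain function here to keep the sketch short.

```typescript
// Hypothetical sketch: the transformation and saveFile's signature are assumptions.
function toFileName(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, "_");
}

function toFolderName(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, "_");
}

function saveFile(folder: string, file: string): string {
  // A real implementation would write to disk; this one only builds the path.
  return `${toFolderName(folder)}/${toFileName(file)}`;
}
```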

#### 1\. Does code exist that looks identical? ✅

Obviously, the functions `toFileName` and `toFolderName` both transform a string into an underscore-separated string. The code is exactly the same, so duplicated code exists.

#### 2\. Does the code repeat the same knowledge and logic? ❓

To answer this question, let’s go ahead and apply the DRY principle **incorrectly.**

We have noticed that the functions `toFileName` and `toFolderName` are identical. The obvious solution would be to merge both functions into one `toFileOrFolderName` function. Now we can use it in our `saveFile` method:

#3 Example: Minimizing Accidental Duplication
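
A sketch of this incorrect merge could look as follows. The transformation and the `saveFile` signature are the same assumptions as before and are not taken from the original embed.

```typescript
// Hypothetical sketch of the incorrect merge: one function for two concepts.
function toFileOrFolderName(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, "_");
}

function saveFile(folder: string, file: string): string {
  // Both the folder and the file now depend on the same function.
  return `${toFileOrFolderName(folder)}/${toFileOrFolderName(file)}`;
}
```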

---

***Why is that wrong?***

Imagine it is decided that file names should now also have a timestamp in front of their name. A developer new to the project is asked to implement this requirement and comes across our `toFileOrFolderName` function. What I often see is a solution like this:

#3 Example: Minimizing Accidental Duplication
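
One plausible shape of such a solution is a boolean flag that switches behavior inside the merged function. The `isFile` flag and the injectable `now` parameter are assumptions for illustration; the original embed may have looked different.

```typescript
// Hypothetical sketch: a flag makes one function serve two different purposes.
function toFileOrFolderName(
  value: string,
  isFile = false,
  now: Date = new Date()
): string {
  const underscored = value.trim().toLowerCase().replace(/\s+/g, "_");
  if (isFile) {
    // Only file names get the timestamp prefix.
    return `${now.getTime()}_${underscored}`;
  }
  return underscored;
}
```

Every caller now has to know whether it is dealing with a file or a folder, even though it only cares about one of them.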

By eliminating our original functions, we have caused our new function `toFileOrFolderName` to do more than one thing, which violates the **Single Responsibility Principle**.

The fact that our original two functions do the same thing is an **accident**. One transforms file names and the other folder names. We don’t want the caller of `toFileName` to know anything about folder names and we don’t want the caller of `toFolderName` to know anything about file names.

---

***How to minimize the duplication correctly?***

Let’s restore our initial `toFileName` and `toFolderName` functions and get rid of the low-level duplication by creating the function `toUnderscore`:

#3 Example: Minimizing Accidental Duplication
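
A minimal sketch of this structure, again assuming that "underscore" means trimming, lowercasing, and replacing whitespace with underscores:

```typescript
// Low-level helper: knows only about string formatting (assumed rules).
function toUnderscore(value: string): string {
  return value.trim().toLowerCase().replace(/\s+/g, "_");
}

// Domain-level functions: each one owns the naming rule for one concept.
function toFileName(value: string): string {
  return toUnderscore(value);
}

function toFolderName(value: string): string {
  return toUnderscore(value);
}
```

If file names later need a timestamp prefix, only `toFileName` has to change, while `toFolderName` and `toUnderscore` stay untouched.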

By following the **Single Responsibility Principle** each function maintains a single level of abstraction and has only one single reason to change.

Even though we didn’t eliminate the accidental duplication entirely, we minimized it by getting rid of the low-level duplication of transforming a string to underscore.

---

Theoretically, we should now be able to differentiate between essential and accidental duplication… **But what if we are not sure?**

### Rule of Three

> “Three strikes and you refactor”

Spotting knowledge duplication isn’t easy and cleaning up accidental duplication is far more harmful than having duplicated code.

The Rule of Three 3️⃣ says that when you spot duplicated code and the first two cases aren’t enough to clearly identify shared knowledge, you should **wait for the third duplicate** before you refactor.

> “It’s really hard and feels terrible, but close your eyes and try it anyway.” — [Justin Weiss](https://www.justinweiss.com/articles/i-dry-ed-up-my-code-and-now-its-hard-to-work-with-what-happened/)

Martin Fowler describes the Rule of Three in his book [*Refactoring: Improving the Design of Existing Code*](https://www.amazon.com/gp/product/0134757599/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0134757599&linkCode=as2&tag=webhighlights-20&linkId=1382f4638ff19240dda8604060ae8809), where he credits it to Don Roberts:

* *The first time you do something, you just do it.*
    
* *The second time you do something similar, you wince at the duplication, but you do the duplicate thing anyway.*
    
* *The third time you do something similar, you refactor.*
    

### Final Thoughts

Duplicated code is one of the major reasons for technical debt and bugs in software. That’s why the DRY principle is one of the most valuable ones in software development. But applying it correctly is even more important.

Remember, whenever you spot some duplicated code, ask yourself: *“Am I looking at **duplicated syntax** or **duplicated knowledge**?”* And if you are not sure, apply the **Rule of Three**.

---

Thanks for reading!

