How to Split Paragraphs into Sentences Using JavaScript

Robin
Updated on April 14, 2023

We often want to split a paragraph into individual sentences for analysis or formatting purposes. You can use built-in JavaScript methods or a regular expression (RegEx) to break your string into sentences quickly and easily.

This post covers two ways to do this:

  • Using only built-in JavaScript string methods.
  • Using regular expressions with the split() method.

In this blog post, we'll walk through a step-by-step guide on splitting paragraphs into an array of sentences using JavaScript with complete code examples and explanations of those methods.

Let's dive in!

Split Paragraphs into Sentences Using JavaScript

To split a paragraph into sentences using only built-in JavaScript methods, you need to combine a few of them: find the start and end position of each sentence, then extract it with the substring() method.

The process takes a few steps:

  • Set the starting position of the sentence (it will be 0 in the beginning).
  • Loop through each character of a paragraph or string.
  • Inside the loop, check whether the current character is a period, question mark, or exclamation point.
  • Extract the sentence from the paragraph using the substring() method.
  • Update the starting position to the index just after the current character.

Let's see an example and understand each step in detail.

const paragraph = `JavaScript is a powerful programming language used to create dynamic and interactive websites. Did you know that it is also used for server-side programming? That's right, you can now use it for both client-side and server-side programming! It's a great time to learn this programming language.`

const sentences = []
let start = 0 // index where the current sentence begins

for (let i = 0; i < paragraph.length; i++) {
    // A period, question mark, or exclamation point ends the sentence
    if (paragraph[i] === '.' || paragraph[i] === '?' || paragraph[i] === '!') {
        // Include the punctuation mark, then trim the leading space
        const sentence = paragraph.substring(start, i + 1).trim()

        sentences.push(sentence)

        // The next sentence begins right after the punctuation mark
        start = i + 1
    }
}

console.log(sentences)

Output:

[
  "JavaScript is a powerful programming language used to create dynamic and interactive websites.",
  "Did you know that it is also used for server-side programming?",
  "That's right, you can now use it for both client-side and server-side programming!",
  "It's a great time to learn this programming language."
]

The example starts with a paragraph of text to split into individual sentences and an empty array, sentences, to store them.

Define a variable start to track the starting position of each sentence in the paragraph. The initial value of this variable will be 0.

Then use a for loop to iterate through each character in the paragraph. Inside the loop, check if the current character is a period ('.'), question mark ('?'), or exclamation point ('!') with an if statement.

If the current character is one of these punctuation marks, you know that you have reached the end of a sentence.

You can use the substring() method to extract the sentence from the paragraph, starting at the start position and ending just after the current character. Because substring() excludes its end index, passing i + 1 keeps the punctuation mark in the sentence.

The trim() method removes any extra whitespace from the beginning or end of the sentence, such as the space that follows the previous sentence. Once you have extracted the sentence, you can add it to the sentences array using the push() method.

Finally, update the start position to the current position plus 1, so that the next sentence starts from the correct position.

After the for loop completes, the sentences array contains every sentence from the original paragraph that ends with one of these punctuation marks.
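The loop shown earlier skips any trailing text that has no closing punctuation, and it treats each character in a run such as "..." or "?!" as the end of its own sentence. Here is a minimal sketch of a reusable version that covers both cases. The helper name splitIntoSentences and its edge-case behavior are assumptions made for illustration, not part of the original example.

```javascript
// Hypothetical helper; the name and edge-case handling are illustrative assumptions.
function splitIntoSentences(text) {
    const terminators = ['.', '?', '!']
    const sentences = []
    let start = 0

    for (let i = 0; i < text.length; i++) {
        // End a sentence only at the last mark in a run such as "..." or "?!"
        if (terminators.includes(text[i]) && !terminators.includes(text[i + 1])) {
            const sentence = text.substring(start, i + 1).trim()
            if (sentence) sentences.push(sentence)
            start = i + 1
        }
    }

    // Keep any trailing text that has no closing punctuation
    const rest = text.substring(start).trim()
    if (rest) sentences.push(rest)

    return sentences
}

console.log(splitIntoSentences('Wait... Really?! Yes. No ending here'))
// [ 'Wait...', 'Really?!', 'Yes.', 'No ending here' ]
```

Abbreviations such as "e.g." or decimal numbers would still be split at each period; handling those needs rules beyond simple character checks.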


Use RegEx to Split Strings into Sentences in JavaScript

As you can see, the previous technique requires a fair amount of code and looks a little complex. If you want to split the text into sentences with a single line of code, you can use a regular expression with the split() method.

const paragraph = `JavaScript is a powerful programming language used to create dynamic and interactive websites. Did you know that it is also used for server-side programming? That's right, you can now use it for both client-side and server-side programming! It's a great time to learn this programming language.`

const sentences = paragraph.split(/(?<=[.!?])\s+/)

console.log(sentences)

Output:

[
  "JavaScript is a powerful programming language used to create dynamic and interactive websites.",
  "Did you know that it is also used for server-side programming?",
  "That's right, you can now use it for both client-side and server-side programming!",
  "It's a great time to learn this programming language."
]

The regular expression /(?<=[.!?])\s+/ matches one or more whitespace characters that come immediately after a period ('.'), question mark ('?'), or exclamation point ('!').

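The (?<=pattern) syntax is known as a positive lookbehind assertion. It requires the punctuation mark to appear right before the whitespace without making it part of the match, so split() leaves the punctuation attached to its sentence.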
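To see what the lookbehind changes, here is a small comparison on a short sample string (the variable names are illustrative). Without the lookbehind, the punctuation mark becomes part of the separator and is thrown away by split().

```javascript
const text = 'One. Two? Three!'

// No lookbehind: the punctuation is consumed as part of the separator and lost
const withoutLookbehind = text.split(/[.!?]\s+/)
console.log(withoutLookbehind) // [ 'One', 'Two', 'Three!' ]

// Lookbehind: only the whitespace is the separator, so the punctuation stays
const withLookbehind = text.split(/(?<=[.!?])\s+/)
console.log(withLookbehind) // [ 'One.', 'Two?', 'Three!' ]
```

The last sentence keeps its "!" in both cases because no whitespace follows it, so neither pattern matches there.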

When this regular expression is passed to the split() method on the paragraph string, the string is split at each run of whitespace that comes after a period, question mark, or exclamation point.

This will effectively split the paragraph into an array of individual sentences.
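Lookbehind assertions were added to JavaScript in ES2018, so some older engines do not support them. As a rough alternative for such environments, the sketch below uses match() with a global pattern instead of split(). The helper name splitWithMatch is hypothetical and chosen only for this illustration.

```javascript
// Hypothetical helper name; a sketch for engines without lookbehind support.
function splitWithMatch(text) {
    // Each match is a run of non-terminator characters plus any closing marks
    const matches = text.match(/[^.!?]+[.!?]*/g) || []

    // Trim the space left at the start of each match and drop empty results
    return matches.map((s) => s.trim()).filter(Boolean)
}

console.log(splitWithMatch('Is it ready? Yes! Ship it'))
// [ 'Is it ready?', 'Yes!', 'Ship it' ]
```

Unlike the split() version, this pattern ends a sentence at a punctuation mark even when no whitespace follows it, so a string like "v1.2" would be broken in two.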


Conclusion

Splitting a paragraph into individual sentences can be a useful task in many JavaScript applications. By leveraging built-in string methods, we can easily extract sentences and format them as needed.

You have also seen how to use a regular expression with the split() method to split a string into sentences at different punctuation marks. This approach is a lot cleaner and needs less code.
