📃 Summarize articles with TextRank for Korean

Summarizer

📃 A program that uses the PageRank algorithm to summarize a body of text down to a given ratio.
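The idea behind this kind of summarizer can be sketched in plain Java. This is a minimal illustration, not the code of this project: the class name `TextRankSketch` and its methods are hypothetical, sentences are split on punctuation, and words are split on whitespace rather than by a morphological analyzer, so it is only a rough approximation for Korean text. Sentences become graph nodes, word overlap becomes the edge weight, PageRank-style iteration scores the nodes, and the top-scoring share of sentences is returned in original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TextRankSketch {

    /** Splits text into sentences at '.', '!' or '?' followed by whitespace. */
    public static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) sentences.add(s);
        }
        return sentences;
    }

    /** TextRank-style overlap: shared words / (log|A| + log|B|). */
    public static double similarity(Set<String> a, Set<String> b) {
        // Sentences with fewer than two words get no edges,
        // which also keeps the denominator positive.
        if (a.size() < 2 || b.size() < 2) return 0.0;
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size() / (Math.log(a.size()) + Math.log(b.size()));
    }

    /** Weighted PageRank by fixed-count power iteration. */
    public static double[] rank(double[][] w, double damping, int iterations) {
        int n = w.length;
        double[] outSum = new double[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) outSum[i] += w[i][j];
        double[] score = new double[n];
        Arrays.fill(score, 1.0);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = 0; j < n; j++) {
                    if (w[j][i] > 0) sum += w[j][i] / outSum[j] * score[j];
                }
                next[i] = (1 - damping) + damping * sum;
            }
            score = next;
        }
        return score;
    }

    /** Returns roughly ratio * (sentence count) top-ranked sentences, in original order. */
    public static List<String> summarize(String text, double ratio) {
        List<String> sentences = splitSentences(text);
        int n = sentences.size();
        List<Set<String>> words = new ArrayList<>();
        for (String s : sentences) {
            // Assumption: whitespace tokens stand in for morphemes.
            words.add(new HashSet<>(Arrays.asList(s.replaceAll("[.!?,]", "").split("\\s+"))));
        }
        double[][] w = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                w[i][j] = w[j][i] = similarity(words.get(i), words.get(j));

        double[] score = rank(w, 0.85, 30);
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) order.add(i);
        order.sort((x, y) -> Double.compare(score[y], score[x])); // highest score first

        int keep = Math.min(n, Math.max(1, (int) Math.round(n * ratio)));
        List<Integer> top = new ArrayList<>(order.subList(0, keep));
        Collections.sort(top); // restore document order
        List<String> result = new ArrayList<>();
        for (int i : top) result.add(sentences.get(i));
        return result;
    }
}
```

Calling `TextRankSketch.summarize(text, 0.3)` would keep about 30% of the sentences. The damping factor 0.85 and the 30 iterations are conventional choices for this sketch, not values taken from this repository.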

Usage

Below is test code that uses the Summarizer object.

AppTest.java

package us.narin.summarizer;

import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

/**
 * Unit test for simple Summarizer.
 */
public class AppTest
        extends TestCase {
    /**
     * Create the test case
     *
     * @param testName name of the test case
     */
    public AppTest(String testName) {
        super(testName);
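        // Read the whole file as a single string ("\\Z" matches end of input)
        // and print its summary. try-with-resources closes the Scanner.
        try (Scanner scanner = new Scanner(new File("./test.txt"))) {
            Summarizer summarizer = new Summarizer(scanner.useDelimiter("\\Z").next());
            System.out.println(summarizer.summarize());
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }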
    }

    /**
     * @return the suite of tests being tested
     */
    public static Test suite() {
        return new TestSuite(AppTest.class);
    }

    /**
     * Rigorous Test :-)
     */
    public void testApp() {
        assertTrue(true);
    }
}

test.txt

๋ฐฐ์šฐ ๊น€์šฐ๋นˆ์ด ๋น„์ธ๋‘์•” ํˆฌ๋ณ‘ ์ค‘์ธ ๊ฐ€์šด๋ฐ 1์ฐจ ํ•ญ์•”์น˜๋ฃŒ๋ฅผ ๋งˆ์นœ ๊ฒƒ์œผ๋กœ ์•Œ๋ ค์กŒ๋‹ค.

๊น€์šฐ๋นˆ์˜ ์†Œ์†์‚ฌ ์‹ธ์ด๋”์ŠคHQ ์ธก์€ "๊น€์šฐ๋นˆ์ด ์ตœ๊ทผ 1์ฐจ ํ•ญ์•”์น˜๋ฃŒ๋ฅผ ๋งˆ์ณค๋‹ค. ์•„์ง ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์˜ค์ง€ ์•Š์•„ ๋ญ๋ผ ๋ฐํžˆ๊ธฐ๊ฐ€ ์กฐ์‹ฌ์Šค๋Ÿฌ์šด ์ƒํ™ฉ์ด๋‹ค. ๊ฒฐ๊ณผ๋ฅผ ๊ธฐ๋‹ค๋ฆฌ๋Š” ์ค‘์ด๋‹ค"๋ผ๊ณ  ๋ฐํ˜”๋‹ค.

์•ž์„œ ๊น€์šฐ๋นˆ์€ ์ง€๋‚œ 5์›” ๋ชธ์— ์ด์ƒ์„ ๋Š๊ปด ์ฐพ์•˜๋˜ ๋ณ‘์›์—์„œ ๋น„์ธ๋‘์•” ์ง„๋‹จ์„ ๋ฐ›๊ณ  ํˆฌ๋ณ‘ ์ค‘์ด๋‹ค.

๊ฐ‘์ž‘์Šค๋Ÿฐ ๊น€์šฐ๋นˆ์˜ ๋น„์ธ๋‘์•” ํˆฌ๋ณ‘ ์†Œ์‹์ด ์ „ํ•ด์ง€์ž ํŒฌ๋“ค์€ ๋‹นํ˜น๊ฐ์„ ๊ฐ์ถ”์ง€ ๋ชปํ–ˆ๋‹ค.

ํŠนํžˆ ๊น€์šฐ๋นˆ์ด ๋น„์ธ๋‘์•” ์ง„๋‹จ ๋ฐ›๊ธฐ ์ „ KBS 2TV ๋“œ๋ผ๋งˆ 'ํ•จ๋ถ€๋กœ ์• ํ‹‹ํ•˜๊ฒŒ'์—์„œ ์‹œํ•œ๋ถ€ ์‹ ์ค€์˜ ์—ญ์„ ๋งก์€ ๋ฐ” ์žˆ๋‹ค.

์‹ ์ค€์˜ ์—ญ์˜ ๊น€์šฐ๋นˆ์€ ๊ทน ์ค‘ "์‹œ๊ฐ„์˜ ์œ ํ•œํ•จ์„ ์•ˆ๋‹ค๋Š” ๊ฑด ์Šฌํ”„๊ณ  ๊ดด๋กœ์šด ์ผ์ด ์•„๋‹ˆ๋ผ ์ˆจ๊ฒจ์™”๋˜ ์ง„์‹ฌ์„ ๋“œ๋Ÿฌ๋‚ด๊ณ  ์šฉ๊ธฐ๋ฅผ ๋‚ผ ์ˆ˜ ์žˆ๊ฒŒ ํ•˜๋Š”, ๋‚ด๊ฒŒ ์ฃผ์–ด์ง„ ๋งˆ์ง€๋ง‰ ์ถ•๋ณต์ธ์ง€๋„ ๋ชจ๋ฅด๊ฒ ์Šต๋‹ˆ๋‹ค"๋ผ๋Š” ๋…๋ฐฑ์„ ํ•˜๊ธฐ๋„ ํ–ˆ๋‹ค.

ํ•ด๋‹น ๋Œ€์‚ฌ๋Š” ๊น€์šฐ๋นˆ์˜ ๋น„์ธ๋‘์•” ์ง„๋‹จ๊ณผ ๋งž๋ฌผ๋ฆฌ๋ฉฐ ๋งŽ์€ ์ด๋“ค์—๊ฒŒ ๊ณต๊ฐ๊ณผ ์œ„๋กœ๋ฅผ ์•ˆ๊ฒผ๋‹ค.

Output

[배우 김우빈이 비인두암 투병 중인 가운데 1차 항암치료를 마친 것으로 알려졌다 ., 앞서 김우빈은 지난 5월 몸에 이상을 느껴 찾았던 병원에서 비인두암 진단을 받고 투병 중이다 ., 갑작스런 김우빈의 비인두암 투병 소식이 전해지자 팬들은 당혹감을 감추지 못했다 .]

Installing dependencies

KoalaNLP is used so that most of the morphological analyzers currently available in South Korea can be used, and jgrapht is used for the graph implementation.

The morphological analyzer used by default is the Hannanum Korean morphological analyzer.

pom.xml

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-hannanum_2.12</artifactId>
      <classifier>assembly</classifier>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-twitter_2.12</artifactId>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-komoran_2.11</artifactId>
      <version>1.5.1</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-eunjeon_2.12</artifactId>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-kkma_2.12</artifactId>
      <classifier>assembly</classifier>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-komoran_2.12</artifactId>
      <classifier>assembly</classifier>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-core_2.12</artifactId>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>kr.bydelta</groupId>
      <artifactId>koalanlp-kryo_2.12</artifactId>
      <version>1.5.4</version>
    </dependency>
    <dependency>
      <groupId>net.sf.jung</groupId>
      <artifactId>jung-api</artifactId>
      <version>2.1.1</version>
    </dependency>
    <dependency>
      <groupId>net.sf.jung</groupId>
      <artifactId>jung-graph-impl</artifactId>
      <version>2.1.1</version>
    </dependency>
    <dependency>
      <groupId>org.jgrapht</groupId>
      <artifactId>jgrapht-core</artifactId>
      <version>1.0.1</version>
    </dependency>
    <dependency>
      <groupId>jgraph</groupId>
      <artifactId>jgraph</artifactId>
      <version>5.13.0.0</version>
    </dependency>

Please install the dependency libraries listed above using Maven.