Java String Word-Frequency Counting: Example Code

2016-02-19 10:51

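This example shows how to count word frequencies in a string with Java: the input is lowercased, stripped of punctuation, split on spaces, tallied in a HashMap, and returned as a list sorted by count.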


The code is as follows:

package com.gpdi.action;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordsStatistics {

    class Obj {
        int count ;
        Obj(int count){
            this.count = count;
        }
    }

    public List<WordCount> statistics(String word) {
        List<WordCount> rs = new ArrayList<WordCount>();
        Map<String, Obj> map = new HashMap<String, Obj>();

        if(word == null ) {
            return null;
        }
        word = word.toLowerCase();
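        // Strip possessives and punctuation. String.replace matches literally;
        // replaceAll(".", "") would treat "." as a regex and delete every character.
        word = word.replace("'s", "");
        word = word.replace(",", "");
        word = word.replace("-", "");
        word = word.replace(".", "");
        word = word.replace("'", "");
        word = word.replace(":", "");
        word = word.replace("!", "");
        // Assumed to be "\n" originally (the backslash appears to have been lost);
        // replaced with a space so words on adjacent lines do not fuse together.
        word = word.replace("\n", " ");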

        String [] wordArray = word.split(" ");
        for(String simpleWord : wordArray) {
            simpleWord = simpleWord.trim(); 
            if (simpleWord != null && !simpleWord.equalsIgnoreCase("")) {
                Obj cnt = map.get(simpleWord);
                if ( cnt!= null ) {
                    cnt.count++;
                }else {
                    map.put(simpleWord, new Obj(1));
                }
            }
        }

        for(String key : map.keySet()) {
            WordCount wd = new WordCount(key,map.get(key).count);
            rs.add(wd);
        }

        Collections.sort(rs, new java.util.Comparator<WordCount>(){
            @Override
            public int compare(WordCount o1, WordCount o2) {
                int result = 0 ;
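                // The comparison operators were lost from the page; the ones below
                // assume higher counts come first and ties are ordered alphabetically.
                if (o1.getCount() > o2.getCount() ) {
                    result = -1;
                }else if (o1.getCount() < o2.getCount()) {
                    result = 1;
                }else {
                    int strRs = o1.getWord().compareToIgnoreCase(o2.getWord());
                    if ( strRs > 0 ) {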
                        result = 1;
                    }else {
                        result = -1 ;
                    }
                }
                return result;
            }

        });
        return rs;
    }

     
    public static void main(String args[]) {
        String word = "Pinterest is might be aa ab aa ab marketer's dream  - ths site is largely used to curate products " ;
        WordsStatistics s = new WordsStatistics();
        List<WordCount> rs = s.statistics(word);
        for(WordCount word1 : rs) {
            System.out.println(word1.getWord()+"*"+word1.getCount());
        }
    }

}
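
The listing uses a WordCount class that the page does not show. Below is a minimal sketch of what it might look like, assuming only what the listing calls: a (String, int) constructor plus getWord() and getCount(). The field names and the choice to make it immutable are assumptions, not part of the original code.

```java
// Hypothetical reconstruction: the original page does not include this class.
// Only the constructor shape and getWord()/getCount() are implied by the listing.
final class WordCount {
    private final String word;  // the normalized (lowercased) word
    private final int count;    // number of occurrences of that word

    WordCount(String word, int count) {
        this.word = word;
        this.count = count;
    }

    public String getWord() {
        return word;
    }

    public int getCount() {
        return count;
    }
}
```

Placed in the same package as WordsStatistics (com.gpdi.action in the listing), this should be enough for the statistics method and main to compile; main then prints one word*count pair per line.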

(This article comes from the Tulaoshi website; for more, visit https://www.tulaoshi.com/bianchengyuyan/)


Source: https://www.tulaoshi.com/n/20160219/1595937.html
