Approximate TF-IDF based on topic extraction from massive message stream using the GPU