2013-08-24 3 views
1

Parsing email with mime4j

I ran into a problem when parsing an email with mime4j. The email has an attachment, and I parse it with MimeStreamParser. The parser never calls the startMultipart method. Instead, it calls the body method only once, and the BodyDescriptor is "text/plain".

Does anyone know whether the root cause is the email format or my program?

The code is as follows:

import java.io.FileInputStream; 
import java.io.FileNotFoundException; 
import java.io.IOException; 
import java.io.InputStream; 

import org.apache.james.mime4j.*; 
import org.apache.james.mime4j.dom.BinaryBody; 
import org.apache.james.mime4j.dom.Body; 
import org.apache.james.mime4j.dom.Entity; 
import org.apache.james.mime4j.dom.Header; 
import org.apache.james.mime4j.dom.Message; 
import org.apache.james.mime4j.dom.MessageBuilder; 
import org.apache.james.mime4j.dom.Multipart; 
import org.apache.james.mime4j.dom.TextBody; 
import org.apache.james.mime4j.dom.address.Mailbox; 
import org.apache.james.mime4j.dom.address.MailboxList; 
import org.apache.james.mime4j.dom.field.AddressListField; 
import org.apache.james.mime4j.dom.field.ContentTypeField; 
import org.apache.james.mime4j.dom.field.DateTimeField; 
import org.apache.james.mime4j.dom.field.UnstructuredField; 
import org.apache.james.mime4j.field.address.AddressFormatter; 
import org.apache.james.mime4j.message.BodyPart; 
import org.apache.james.mime4j.message.MessageImpl; 
import org.apache.james.mime4j.message.DefaultMessageBuilder; 
import org.apache.james.mime4j.message.SimpleContentHandler; 
import org.apache.james.mime4j.parser.ContentHandler; 
import org.apache.james.mime4j.parser.MimeStreamParser; 
import org.apache.james.mime4j.stream.BodyDescriptor; 
import org.apache.james.mime4j.stream.Field; 
import org.apache.james.mime4j.stream.MimeConfig; 

public class TestClass extends SimpleContentHandler {

    public static void main(String[] args) throws MimeException, IOException {
        ContentHandler handler = new TestClass();
        MimeConfig config = new MimeConfig();
        MimeStreamParser parser = new MimeStreamParser(config);
        parser.setContentHandler(handler);
        InputStream instream = new FileInputStream("mail/testuser1");
        try {
            parser.parse(instream);
        } finally {
            instream.close();
        }
    }

    // Each callback below only logs that it was invoked.

    @Override
    public void headers(Header arg0) {
        System.out.println("headers args: " + arg0);
    }

    @Override
    public void body(BodyDescriptor bd, InputStream is) {
        System.out.println("body descriptor: " + bd);
    }

    public void startMessage() {
        System.out.println("startMessage");
    }

    public void endMessage() {
        System.out.println("endMessage");
    }

    public void startBodyPart() {
        System.out.println("startBodyPart");
    }

    public void endBodyPart() {
        System.out.println("endBodyPart");
    }

    public void preamble(InputStream is) {
        System.out.println("preamble");
    }

    public void epilogue(InputStream is) {
        System.out.println("epilogue");
    }

    public void startMultipart(BodyDescriptor bd) {
        System.out.println("startMultipart");
    }

    public void endMultipart() {
        System.out.println("endMultipart");
    }

    public void raw(InputStream is) {
        System.out.println("raw");
    }
}

Here is part of my email file:

From MAILER_DAEMON Wed Aug 21 19:24:53 2013 
Date: Wed, 21 Aug 2013 19:24:53 +0800 
From: Mail System Internal Data <[email protected]> 
Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA 
Message-ID: <[email protected]> 
X-IMAP: 1377072167 0000000003 
Status: RO 

This text is part of the internal format of your mail folder, and is not 
a real message. It is created automatically by the mail system software. 
If deleted, important folder data will be lost, and it will be re-created 
with the data reset to initial values. 

From [email protected] Sat Aug 24 10:53:42 2013 
Return-Path: <[email protected]> 
X-Original-To: [email protected] 
Delivered-To: [email protected] 
Received: from shupc (unknown [192.168.75.130]) 
by mail.abc.com (Postfix) with SMTP id C0F5B1EFBC3 
for <[email protected]>; Sat, 24 Aug 2013 10:53:42 +0800 (CST) 
Message-ID: <[email protected]> 
From: "john" <[email protected]> 
To: "smith" <[email protected]> 
Subject: aaa 
Date: Sat, 24 Aug 2013 10:53:42 +0800 
MIME-Version: 1.0 
Content-Type: multipart/mixed; 
boundary="----=_NextPart_000_000B_01CEA0B8.32903020" 
X-Priority: 3 
X-MSMail-Priority: Normal 
X-Mailer: Microsoft Outlook Express 6.00.2900.5512 
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512 
X-UID: 3             
Status: O 
Content-Length: 386430 

This is a multi-part message in MIME format. 

------=_NextPart_000_000B_01CEA0B8.32903020 
Content-Type: multipart/alternative; 
boundary="----=_NextPart_001_000C_01CEA0B8.32903020" 


------=_NextPart_001_000C_01CEA0B8.32903020 
Content-Type: text/plain; 
charset="gb2312" 
Content-Transfer-Encoding: base64 

dGVzdCBhYSBiYiBjYw== 

------=_NextPart_001_000C_01CEA0B8.32903020 
Content-Type: text/html; 
    charset="gb2312" 
Content-Transfer-Encoding: base64 

PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9uYWwv 
L0VOIj4NCjxIVE1MPjxIRUFEPg0KPE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVu 
dD0idGV4dC9odG1sOyBjaGFyc2V0PWdiMjMxMiI+DQo8TUVUQSBjb250ZW50PSJNU0hUTUwgNi4w 
MC4yOTAwLjU1MTIiIG5hbWU9R0VORVJBVE9SPg0KPFNUWUxFPjwvU1RZTEU+DQo8L0hFQUQ+DQo8 
Qk9EWSBiZ0NvbG9yPSNmZmZmZmY+DQo8RElWPjxGT05UIHNpemU9Mj50ZXN0IGFhIGJiIGNjPC9G 
T05UPjwvRElWPjwvQk9EWT48L0hUTUw+DQo= 

------=_NextPart_001_000C_01CEA0B8.32903020-- 

------=_NextPart_000_000B_01CEA0B8.32903020 
Content-Type: application/octet-stream; 
    name="10112716229607.doc" 
Content-Transfer-Encoding: base64 
Content-Disposition: attachment; 
    filename="10112716229607.doc" 

0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAAFAAAAKAIAAAAAAAAA 
EAAAKgIAAAEAAAD+////AAAAACMCAAAkAgAAJQIAACYCAAAnAgAA//////////////////////// 
//////////////////////////////////////////////////////////////////////////// 
//////////////////////////////////////////////////////////////////////////// 
//////////////////////////////////////////////////////////////////////////// 
//////////////////////////////////////////////////////////////////////////// 
//////////////////////////////////////////////////////////////////////////// 
//////////////////////////////////////////////////////////////////////////// 
///////////////////////////////////////////////////////////////////////////s 
pcEAcWAJBAAA8FK/AAAAAAAAEAAAAAAABgAArJ0CAA4AYmpianFQcVAAAAAAAAAAAAAAAAAAAAAA 
AAAECBYAOBIDABM6AQATOgEA1gwBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAA 
AAAAAAD//w8AAAAAAAAAAAD//w 

Answers

1

The problem is with the sample email, not with multipart handling.

Here is my test program. It embeds the multipart email as inline text.

Remove the first header ("FROM MAILER"), then either indent every line that continues the Content-Type field (for example charset and boundary) with whitespace, as the spec requires (RFC 822 and later), or remove the line break.

Change:

Content-Type: multipart/mixed;
boundary="----=_NextPart_000_000B_01CEA0B8.32903020"

to either:

Content-Type: multipart/mixed;
    boundary="----=_NextPart_000_000B_01CEA0B8.32903020"

or:

Content-Type: multipart/mixed; boundary="----=_NextPart_000_000B_01CEA0B8.32903020"

See the other messages for examples.
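
The cleanup suggested in this answer can be sketched in plain Java, without mime4j. The names MailFileCleaner, splitMbox and refoldHeaders are hypothetical and not part of any library. The sketch rests on two assumptions: the file is in mbox format, where each message is preceded by a "From " separator line that follows a blank line, and a header line that neither starts with whitespace nor looks like "Name:" is a continuation of the previous field. It repairs only the top-level header block, so nested part headers would need the same treatment, and a continuation line with a colon before its first space would be misread as a new field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of mime4j) that prepares an mbox-style file for parsing. */
final class MailFileCleaner {

    // A header field starts with a name of non-space, non-colon characters followed by a colon.
    private static final Pattern FIELD_START = Pattern.compile("^[^\\s:]+:");

    /** Splits mbox text into messages and drops every "From " separator line. */
    static List<String> splitMbox(String mbox) {
        List<String> messages = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean previousBlank = true; // the start of the file counts as a boundary
        for (String line : mbox.split("\r?\n")) {
            if (previousBlank && line.startsWith("From ")) {
                // Separator line: close the message collected so far, if any.
                if (current.length() > 0) {
                    messages.add(current.toString());
                }
                current = new StringBuilder();
            } else {
                current.append(line).append("\r\n");
            }
            previousBlank = line.trim().isEmpty();
        }
        if (current.length() > 0) {
            messages.add(current.toString());
        }
        return messages;
    }

    /** Indents unindented continuation lines in the top-level header block only. */
    static String refoldHeaders(String message) {
        StringBuilder out = new StringBuilder();
        boolean inHeader = true;
        for (String line : message.split("\r?\n")) {
            if (inHeader) {
                if (line.isEmpty()) {
                    inHeader = false; // the first blank line ends the header block
                } else if (!Character.isWhitespace(line.charAt(0))
                        && !FIELD_START.matcher(line).find()) {
                    line = "    " + line; // assumed to continue the previous field
                }
            }
            out.append(line).append("\r\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String mbox = String.join("\n",
            "From MAILER_DAEMON Wed Aug 21 19:24:53 2013",
            "Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA",
            "",
            "This text is part of the internal format of your mail folder.",
            "",
            "From sender Sat Aug 24 10:53:42 2013",
            "Subject: aaa",
            "Content-Type: multipart/mixed;",
            "boundary=\"----=_NextPart_000_000B_01CEA0B8.32903020\"",
            "",
            "This is a multi-part message in MIME format.");
        List<String> messages = splitMbox(mbox);
        // Entry 0 is the folder's internal pseudo-message; entry 1 is the real email.
        System.out.print(refoldHeaders(messages.get(1)));
    }
}
```

Each cleaned message could then be wrapped in a ByteArrayInputStream and passed to MimeStreamParser.parse, one message at a time.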

-1

Use the following library, which is built on mime4j: email-mime-parser

The sample code it provides handles the email parsing, and the resulting 'Email' object offers a convenient way to solve the problem:

ContentHandler contentHandler = new CustomContentHandler(); 

MimeConfig mime4jParserConfig = new MimeConfig(); 
BodyDescriptorBuilder bodyDescriptorBuilder = new DefaultBodyDescriptorBuilder(); 
MimeStreamParser mime4jParser = new MimeStreamParser(mime4jParserConfig,DecodeMonitor.SILENT,bodyDescriptorBuilder); 
mime4jParser.setContentDecoding(true); 
mime4jParser.setContentHandler(contentHandler); 


// Hypothetical path: provide your own email MIME stream here.
InputStream mailIn = new FileInputStream("path/to/message.eml");
mime4jParser.parse(mailIn); 

Email email = ((CustomContentHandler) contentHandler).getEmail(); 
+1

While this may theoretically answer the question, it would be preferable to include the essential parts of the answer here and provide the link for reference (// meta.stackoverflow.com/q/8259).
