TeamMsgExtractor/msg-extractor

Project dependencies may have API risk issues

PyDeps opened this issue · 3 comments

Hi, in msg-extractor, inappropriate dependency version constraints can introduce risks.

Below are the dependencies and version constraints that the project is currently using:

imapclient>=2.1.0
olefile>=0.46
tzlocal>=4.2
compressed_rtf>=1.0.6
ebcdic>=1.1.1
beautifulsoup4>=4.11.1
RTFDE>=0.0.2
chardet>=4.0.0

The version constraint == introduces a risk of dependency conflicts because the allowed range is too strict.
A version constraint with no upper bound (or *) introduces a risk of missing-API errors because the latest version of a dependency may have removed some APIs.

After further analysis, in this project,
The version constraint of dependency olefile can be changed to ==0.44.
The version constraint of dependency olefile can be changed to >=0.44,<=0.44.
The version constraint of dependency beautifulsoup4 can be changed to >=4.10.0,<=4.11.1.

The suggested changes aim to reduce dependency conflicts as much as possible
while still allowing the latest versions that do not cause call errors in the project.
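To make the suggested ranges concrete, here is a minimal sketch of checking a version against a comma-separated constraint. The helper names (`parse_version`, `satisfies`) are hypothetical and not part of msg-extractor or pip, and the parser only understands plain dotted integers, so it is an illustration rather than a replacement for a real specifier parser.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
_OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '4.11.1' into (4, 11, 1). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(left, right):
    """Pad the shorter tuple with zeros so '0.44' compares equal to '0.44.0'."""
    width = max(len(left), len(right))
    return left + (0,) * (width - len(left)), right + (0,) * (width - len(right))


def satisfies(version, constraint):
    """Return True if version meets every clause of e.g. '>=4.10.0,<=4.11.1'."""
    candidate = parse_version(version)
    for clause in constraint.split(","):
        match = re.fullmatch(r"\s*(==|>=|<=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        left, right = _pad(candidate, parse_version(bound))
        if not _OPS[op](left, right):
            return False
    return True


print(satisfies("4.11.1", ">=4.10.0,<=4.11.1"))  # True
print(satisfies("0.44", "==0.44"))               # True
print(satisfies("0.46", ">=0.44,<=0.44"))        # False
```

Under this reading, the two olefile suggestions (==0.44 and >=0.44,<=0.44) accept exactly the same single version, and both exclude 0.46, the project's current lower bound.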

The current project invokes all of the following methods.

Methods called from olefile
olefile.olefile.filetime2datetime
olefile.OleFileIO
olefile.isOleFile
Methods called from beautifulsoup4
bs4.Tag.get
bs4.BeautifulSoup.new_tag
bs4.BeautifulSoup
All methods called in the project
exceptions.UnknownCodepageError
enums.ElectronicAddressProperties
x.decode
self.__eq__
option.split
self.__uint16_t.unpack
node.get_content_charset
properHex
foundStreams.append
filetimeToUtc
PersonalDistributionListEntryID
constants.PROPERTIES.get
self.__ole.exists
self.__props.items
os.getenv.decode
enumerate
json.load
AttributeError
self.body.html.escpae.replace.replace.encode
bs4.BeautifulSoup.findAll
os.path.splitext
dev_classes.attachment.Attachment
self.__int16_t.unpack
setupDevLogger
constants.RE_RTF_BODY_FALLBACK_FS.sub
self.injectRtfHeader
self.__prefixList.len.dir_.startswith
ord
value.replace.replace
VariableLengthProp
MessageEntryID
msg.save
_struct.unpack
constants.RE_BIN.search.group
dir_.endswith
NumericalNamedProperty
logging.critical
zipfile.ZipFile.namelist
parser.add_mutually_exclusive_group.add_argument
f.write
super.__init__
constants.STMI64.unpack
enums.BCImageSource
html.escpae
olefile.olefile.filetime2datetime
constants.STFIX.unpack
injectableHeader.encode
PIL.ImageDraw.ImageDraw.rectangle
subprocess.Popen
openMsg
other.pack
all
meeting_forward.MeetingForwardNotification
_helpers.BytesReader.readAnsiString
logging.config.dictConfig
extract_msg.utils.openMsg
_helpers.BytesReader.readUnsignedByte
self.listdir
utils.divide
entry.properHex.upper
enums.OORBodyFormat
os.makedirs
enums.MeetingRecipientType
path.str.replace
cchKeyName.reader.read.decode
compressed_rtf.crc32.crc32
constants.RE_BIN.search.end
os.path.split
enums.MessageType
Exception
self.__propertiesDict.__iter__
hex.rjust
self.openstream
data.encode.as_bytes
utils.bitwiseAdjustedAnd
logging.getLogger.warning
ImportError
extract_msg.dev.main
self.pack
parser.find.insert
self.exists
msg.listDir
self.__int8_t.unpack
self.body.html.escape.replace.replace.encode
json.dump
kwargs.get
inputTostring
stream.read
enums.NamedPropertyType
joinStr.join
email.parser.Parser.parsestr.add_header
toProcess.popleft.get_payload
self.body.html.escape.replace.replace
RTFDE.DeEncapsulator
join.hex
f.read
self.__prefix.split.split
self._recipients.append
wkOptions.append
instance.sExists
RuntimeError
traceback.format_exc
_helpers.BytesReader.readShort
os.path.join
self.getJson
kwargs.get.pathlib.Path.absolute
utils.htmlSanitize
validateAttachment
self.attachmentClass
self.props.get
os.access
sys.exit
body.replace.decode
getStringDetails
self.classType.lower
json.dumps
self.__msg.exists
utils.filetimeToDatetime
self.unpack
_helpers.BytesReader.read
bs.find.insert
self.parseType
width.self.readByteString.decode
join.replace
id.upper.upper
instance._getStream
utils.prepareFilename
correctedHtml.find.append
process.communicate.decode
WrappedEntryID
recipientDirs.append
logging.getLogger.addHandler
prop.createProp
getStreamDetails
codecs.lookup
self.getSavePdfBody
exceptions.WKError
collections.deque.extend
bs4.BeautifulSoup.new_tag
base64.b64encode
y.endswith
bs4.Tag.get
body.replace.replace
post.Post
argparse.ArgumentParser.add_mutually_exclusive_group
collections.deque
int
constants.ST_TZ.unpack
olefile.OleFileIO
email.message_from_bytes
self.getFilename
exceptions.TZError
enums.BCTextFormat
compressed_rtf.decompress
enums.RecurFrequency
utils.properHex
self.readByteString
set.add
super
chardet.detect
utils.rtfSanitizeHtml
setuptools.setup
kwargs.get.escape
tuple
collections.deque.popleft
self.getSaveRtfBody
self.__msg._getTypedStream
x.find
main
ceilDiv
bin
constants.STI32.unpack
instance.listDir
self.saveEmbededMessage
propertyID.upper.upper
propertyName.upper.upper
subprocess.Popen.communicate
self.saveRaw
instance.props.has_key
stream.read.decode
random.choice
self.__streamSource._getTypedData
line.strip
tag.get.startswith
min
join
constants.STMI32.unpack
utils.roundUp
testName.str.replace
os.path.isfile
_helpers.BytesReader
logging.addLevelName
self._deencapsultor.deencapsulate
_zip.namelist
utils.findWk
olefile.isOleFile
guidVals.hex.upper
zfile.open
self.__int64_t.unpack
os.getcwd
self.__props.keys
recipient.Recipient
email.parser.Parser
correctedHtml.find.insert_before
self._getStringStream
self._getTypedProperty
hex
utils.windowsUnicode
parser.new_tag.append
constants.RE_BIN.search.start
getEmailDetails
os.path.expanduser
_class
strSanitize
constants.ST_BC_HEAD.unpack
fullFilename.exists
decode_utf7
files.append
self.body.html.escape.replace
isinstance
constants.STMF64.unpack
getattr
self._ensureSetNamed
self.__props.get
self.recipientSeparator.join
logging.getLogger.warn
self.getSaveHtmlBody
toProcess.popleft.get_content_maintype
exceptions.UnsupportedMSGTypeError
self.read
constants.ST1.unpack
dict
self.__prefix.split
self.__props.__getitem__
msg._getStream
ErrorType
constants.STUI32.pack
constants.ST_GUID.unpack.hex
self.classType.lower.split
pprint.pprint
bs4.BeautifulSoup.append
prefix.split.pop
os.chdir
constants.RE_INVALID_FILENAME_CHARACTERS.search
magic.from_buffer
_helpers.BytesReader.readByteString
self.openstream.read
embedded.append
extract_msg.utils.setupLogging
fromTimeStamp
self.__ole.listdir
enums.MacintoshEncoding
msg.MSGFile.close
instance.exists
self._getStream.decode
knownMsgClass
testName.exists
logging.getLogger.error
UnsupportedEncodingError
properties.Properties
self.get.__format__
NNTPNewsgroupFolderEntryID
msg.classType.lower.startswith
tzlocal.get_localzone
node.get_content
length.a.rjust.upper
self._getStream
self.regenerateRandomName
MessageID
enums.PropertiesType
constants.STF64.unpack
named.Named
constants.STMF32.unpack
data.decode
value.replace.replace.replace
char.encode
TypeError
inputToBytes
self.tryReadBytes
self.getSaveBody
constants.RE_RTF_BODY_FALLBACK_F.sub
enums.AttachErrorBehavior
open
self.__properties.append
logging.getLogger
exceptions.IncompatibleOptionsError
constants.STVAR.unpack
os.path.exists
self._attachments.append
appointment.AppointmentMeeting
bool
logging.getLogger.log
self.taskDateCompleted.__format__
datetime.datetime
attachment.BrokenAttachment
struct.Struct
nameLength.pos.pos.namesStream.decode
newDirName.str.rstrip
datetime.timedelta
extract_msg.validation.validate
_helpers.BytesReader.assertRead
OneOffRecipient
utils.validateHtml
self.__msg._getStringStream
self.startDate.__format__
any
msg.classType.lower
exceptions.DataNotFoundError
x.inp.ord.hex.rjust
constants.ST_LE_UI64.unpack
logging.warning
AddressBookEntryID
datetime.datetime.now.timetuple
option.startswith
constants.STNP_NAM.unpack
self.__propertiesDict.__len__
self.__props.__contains__
utils.parseType
constants.ST_GUID.unpack
self.fixPath
stringFE
re.compile
name.lower.encode
enums.BCTemplateID
enums.RecurMonthNthWeek
constants.ST3.unpack
enums.BCImageAlignment
FieldInfo
enums.ContactAddressIndex
self._getStringStream.encode
path.str.replace.rstrip
bs4.BeautifulSoup
NotImplementedError
self.namedProperties.get
self._getTypedStream
self.__uint64_t.unpack
structures.misc_id.ServerID
self.__propertiesDict.keys
validateMsg
self._ensureSetProperty
bs4.Tag.insert
constants.ST_BC_FIELD_INFO.unpack
constants.RE_BIN.search
FixedLengthProp
constants.STPEID.unpack
self.listDir
line.strip.split
attachment.save
mask.bin.index
self.deencapsulatedRtf.html.encode
parser.find.insert_before
utils.divide.append
utils.getFullClassName
functools.partial
self.__uint32_t.unpack
_helpers.BytesReader.readUnsignedShort
struct.unpack
list
self.data.save
constants.STMI16.unpack
value.replace.find
exceptions.InvaildPropertyIdError
_helpers.BytesReader.readUnsignedLong
constants.STUI16.unpack
pathlib.Path
data.encode.encode
entry_id.FolderEntryID
self.__double_t.unpack
self.endDate.__format__
sum
self.__uint8_t.unpack
bytesInputVar.decode
msg.exists
formattedProps.append
logging.getLogger.setLevel
enums.RecurDOW
system_time.SystemTime
zipfile.ZipFile.close
StringNamedProperty
utils.inputToMsgpath
enums.DeencapType
shutil.which
time.time
hasattr
x.group
enums.RecurEndType.fromInt
classType.lower.lower
self.props.has_key
task_request.TaskRequest
datetime.datetime.fromtimestamp
utils.bytesToGuid
bs4.BeautifulSoup.prettify
utils.inputToString
entry_id.StoreObjectEntryID
self.__prefixLen.item.startswith
msg.classType.lower.lower
stream.read.decode.replace
message.Message
self.deencapsulateBody
dirName.with_name
bs4.Tag
meeting_request.MeetingRequest
format
version_re.search.groupdict
attachments.append
filetimeToDatetime
utils.setupLogging
constants.RE_HTML_BODY_START.sub
mp.as_bytes
self.__props.__repr__
_helpers.BytesReader.assertNull
utils.createZipOpen
set
argparse.ArgumentParser
re.compile.finditer
dev_classes.Message
task.Task
self.__props.values
self.__ole.__enter__
_helpers.BytesReader.readInt
enums.ResponseStatus
exceptions.UnrecognizedMSGTypeError
formatter.suffix.joinStr.prefix.self.getInjectableHeader.encode
exceptions.StandardViolationError
enums.MessageFormat
constants.STF32.unpack
formatter
self._ensureSetTyped
configPath.exists
data.properHex.upper
constants.RE_RTF_BODY_FALLBACK_PLAIN.sub
self._headerDict.pop
data.base64.b64encode.decode
utils.inputToMsgpath.append
tag.name.lower
self.tell
constants.HEADER_FORMAT.format
formattedProps.pop
input
constants.STNP_ENT.unpack
utils.inputToBytes
self.seek
next
constants.ST_SYSTEMTIME.pack
overrideClass
logging.getLogger.exception
self.__ole.close
x.endswith
_helpers.BytesReader.readByte
self.__ole.openstream
zipfile.ZipInfo
self.__msg._getStream
meeting_response.MeetingResponse
FolderID
re.search
setattr
self.birthday.__format__
enums.AddressBookType
utils.hasLen
utils.getEncodingName
msg.__class__
self.__msg.sExists
dataNodes.append
zipfile.ZipFile
self.__deencap
email.parser.Parser.parsestr
spaces.group
self.fix_path
node.get_filename
self._getTypedData
self.body.html.escpae.replace.replace
self.__msg.existsTypedProperty
ContactAddressEntryID
strListToStr
self.body.html.escpae.replace
self.weddingAnniversary.__format__
self.htmlInjectableHeader.encode
utils.addNumToZipDir
ValueError
len
self.get
utils.verifyPropertyId
dependencies.append
_open
meeting_cancellation.MeetingCancellation
self.slistDir
constants.RE_RTF_ENC_BODY_UGLY.sub
utils.verifyType
constants.RE_RTF_BODY_START.sub
BadHtmlError
utils.openMsg
self.__signedAttachmentClass
unpacked.extraInfo.BytesReader.readUtf16String
enums.ResponseType
_id.upper.upper
inp.replace.replace
PIL.ImageDraw.ImageDraw
attachmentDirs.append
isinstance.encode
bytesToGuid
datetime.datetime.now
bs4.BeautifulSoup.find
message_signed.MessageSigned
ExecutableNotFound
argparse.ArgumentParser.parse_args
utils.addNumToDir
enums.RecipientType
print
msgFiles.append
fullFilename.str.replace
os.path.expandvars
FileExistsError
meeting_exception.MeetingException
PIL.Image.new
range
email.message_from_string
json.loads
self.__props.__iter__
self._ensureSet
func
copy.deepcopy
constants.STUI32.unpack
self.__prefix.split.pop
html.escape
correctedHtml.find.insert_after
enums.PostalAddressID
os.getenv
warnings.warn
enums.ErrorCodeType
str
self.__float_t.unpack
key.upper
self.__recipientSeparator.join
constants.STUI64.unpack
inp.inputToString.replace
constants.RE_HTML_SAN_SPACE.sub
repr
attachment.data.close
inp.inputToString.replace.split
StoreObjectEntryID
msg.MSGFile
msg.saveAttachments
x.startswith
_helpers.BytesReader.readClass
structures.entry_id.EntryID.autoCreate
x.isascii
logging.NullHandler
bodyMarker.group
self.close
exceptions.InvalidFileFormatError
prefixLen.dir_.startswith
constants.STI16.unpack
named.NamedProperties
FolderEntryID
enums.BCLabelFormat
exceptions.UnknownTypeError
stringInputVar.encode
email.utils.parsedate
attachment.UnsupportedAttachment
cls
entries.append
imapclient.imapclient.decode_utf7
enums.RecurCalendarType
entry_id.MessageEntryID
exceptions.ConversionError
copy.copy
errorMsg.format
contact.Contact
instance._getStringStream
self.has_key
_helpers.BytesReader.readUnsignedInt
enums.DisplayType
self.__propertiesDict.values
node.get_content_type
msg.classType.lower.endswith
type
constants.STI64.unpack
inputToString
extract_msg.utils.getCommandArgs
enums.EntryIDType
logging.basicConfig
argparse.ArgumentParser.add_argument
constants.ST2.unpack
utils.isEncapsulatedRtf
constants.ST_SYSTEMTIME.unpack
self.getInjectableHeader
constants.ST_BE_UI16.unpack
out.append
self.__prefix.split.replace
self._genRecipient
self.__int32_t.unpack
sorted
utils.rtfSanitizePlain
utils.unwrapMultipart
RPTSW.fromBits
isinstance.lower
pathlib.Path.exists
glob.glob
enums.TZFlag.fromBits
re.compile.search
self.headerInit
self.injectHtmlHeader
collections.deque.append
utils.msgpathToString
_helpers.BytesReader.readUtf16String
constants.RE_RTF_ENC_BODY_START_1.sub
enums.RecurPatternType
self._readDecodedString
parser.find.insert_after
logging.getLogger.debug
validateRecipient
IOError
logging.getLogger.info

@developer
Could you please help me check this issue?
May I open a pull request to fix it?
Thank you very much.

What I've really wanted to do is release a new version that better specifies which versions of modules definitely work, including valid lower bounds (so that modules using lower version constraints will be compatible), and look at the listed versioning info for other packages to determine what the correct upper bound should be. I don't really have the time at the moment to actually look into the specific bounds, but I agree they should all probably have upper bounds on them. Any that properly support semantic versioning and have a version greater than or equal to 1.0.0 should not have changes that will break the API, and so could have an upper bound below the next major version rather than the next minor version. Of course, nothing is perfect, so meh.

However, I really do need to consider the bounds properly to ensure the module doesn't have security flaws that could be exploited through the dependencies, as they are interacting with email files. I would say olefile and beautifulsoup4 are the most likely places for such flaws.

One thing that confuses me is that you are suggesting going to a lower version of olefile (==0.44 instead of the current >=0.46). Can you explain?

The last large block in your message I frankly don't know what to do with.

If you would like to go ahead and try to do that, be my guest.

I believe this issue to be taken care of in version 0.38.0. All requirements use == or have upper bounds, usually the ones I could confirm semantic versioning on. If I couldn't confirm any reasonable semantic versioning, I left them as a hard == with the intent to check back on them from time to time.
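The bounding policy described in this thread can be sketched roughly as follows. The function name `bounded_constraint` and its `follows_semver` flag are hypothetical, and using the known-good version as the lower bound is an assumption made for illustration. The outputs are examples of the rule, not the actual pins shipped in 0.38.0.

```python
def bounded_constraint(known_good, follows_semver=True):
    """Build a requirement constraint from a version known to work.

    Assumed policy, paraphrasing the thread:
    - no confirmed semantic versioning: hard pin with ==
    - semantic versioning and major >= 1: upper bound below the next major
    - semantic versioning and major == 0: upper bound below the next minor
    """
    if not follows_semver:
        return f"=={known_good}"

    parts = [int(piece) for piece in known_good.split(".")]
    while len(parts) < 2:
        parts.append(0)  # treat '1' as '1.0'
    major, minor = parts[0], parts[1]

    if major >= 1:
        upper = f"{major + 1}.0.0"
    else:
        upper = f"{major}.{minor + 1}.0"
    return f">={known_good},<{upper}"


print(bounded_constraint("4.11.1"))                       # >=4.11.1,<5.0.0
print(bounded_constraint("0.46"))                         # >=0.46,<0.47.0
print(bounded_constraint("0.0.2", follows_semver=False))  # ==0.0.2
```

Whether a given dependency really follows semantic versioning still has to be confirmed by hand, which is why the flag is an explicit argument rather than something the sketch tries to infer.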

Closing due to inactivity