Python and COM
Greg Stein
Tutorial Agenda
• Introduction to COM
• PythonCOM Framework
• Using Python as a COM client
• Using Python as a COM server
• Advanced topics
COM/OLE and ActiveX
• Component Object Model
– Specification for implementing and defining objects
• OLE is the old name, COM is the official
name, ActiveX is the marketing name
What is a COM interface?
• A technique for implementing objects
– Uses a “vtable” to define methods
– Does not have properties
• Basically a C++ object with only public
methods
– But not C++ specific - just borrowed implementation technique
IUnknown
• Base class of all interfaces
– Every COM object must implement
• Defines object lifetimes
– Reference counted using “AddRef” and “Release” methods
• Defines techniques for querying the object for
something useful
– QueryInterface method
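The lifetime and query rules above can be modelled in a few lines of plain Python. This is only a sketch of the semantics: the `ComObject` class and the string interface IDs are made up for illustration, and no real COM call is involved.

```python
# Pure-Python model of IUnknown semantics; not real COM and not pywin32.
class ComObject:
    """Toy object exposing AddRef, Release and QueryInterface."""

    def __init__(self, interfaces):
        # Map of interface ID (a plain string here) -> implementation.
        self._interfaces = dict(interfaces)
        self._refs = 1          # the creator holds the first reference
        self.alive = True

    def AddRef(self):
        self._refs += 1
        return self._refs

    def Release(self):
        self._refs -= 1
        if self._refs == 0:
            self.alive = False  # a real object would free itself here
        return self._refs

    def QueryInterface(self, iid):
        if iid not in self._interfaces:
            raise LookupError("E_NOINTERFACE: " + iid)
        self.AddRef()           # every interface handed out is a new reference
        return self._interfaces[iid]


obj = ComObject({"IID_IUseful": "useful implementation"})
useful = obj.QueryInterface("IID_IUseful")   # reference count is now 2
obj.Release()                                # back to 1, still alive
remaining = obj.Release()                    # 0, the object goes away
```

The point of the model is that a successful QueryInterface always costs one extra Release later.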
Custom interfaces
• Any interface derived from IUnknown
– Therefore must follow IUnknown rules for lifetimes
• Derived?
– Means object can be used as its base. Simple implementation using vtables
• All interfaces are custom interfaces
Objects vs. Interfaces
• Interfaces simply define functionality
• Objects, once instantiated, implement the
interface
– Object class also has unique ID
• Objects always provide multiple interfaces
CLSIDs, GUIDs, UUIDs, IIDs
• COM defines a 128-bit identifier, and an API for creating them
– “High degree of certainty” that these are globally unique
• All use the same type and implementation, acronym reflects different usage
– IID == Interface ID, GUID == Globally Unique Identifier, CLSID == Class ID, UUID ==
Universally Unique Identifier, etc
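As a rough illustration, Python's standard `uuid` module produces identifiers of the same 128-bit shape. This sketch does not call COM's own creation API; the variable names are arbitrary.

```python
import uuid

# uuid4() builds a random 128-bit identifier; COM's own API is not used here.
clsid = uuid.uuid4()
iid = uuid.uuid4()

# COM conventionally writes these in upper case inside braces.
clsid_text = "{" + str(clsid).upper() + "}"

# The same value survives a round trip through its string form.
parsed = uuid.UUID(clsid_text)
```

Whether the value is called a CLSID or an IID depends only on what it is used to name.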
Registering Objects
• Objects register themselves in the Windows
Registry
– Register with their Unique CLSID
• Also register a name for the object
– COM provides API for translating, but names are not guaranteed unique
Creating Objects
• Standard COM API for creation
– CoCreateInstance, passing:
• CLSID identifying the object to create
C++ Pseudo Code
// Create an object, but only get back
// IUnknown pointer, with new reference
IUnknown *pUnk = CoCreateInstance("MyObject", ...)
// Ask the object for a pointer to a useful implementation!
pUseful = pUnk->QueryInterface(IID_IUseful, ...)
pUnk->Release(); // finished with this.
pUseful->DoSomethingUseful();
…
Custom Interface Example
1 of 2
• Native Interfaces using Word
– You would be unlikely to use Word this way
– Demonstrative purposes only!
>>> import ni, win32com, pythoncom
>>> o = pythoncom.CoCreateInstance("Word.Application", None,
...         pythoncom.CLSCTX_ALL, pythoncom.IID_IUnknown)
>>> o
Custom Interface Example
2 of 2
>>> o.QueryInterface(pythoncom.IID_IPersist)
<PyIPersist at 0x85dd54 with obj at 0x423464>
• Almost identical to the pseudo code above
– In fact, Python is far better than C++, as long as we support the required interfaces natively
• No AddRef or Release required, or even
exposed
IDispatch - poor man’s COM
1 of 2
• Also known as “Automation”
• Derived from IUnknown
• Defines vtable methods to determine
“dispatch methods and properties” at runtime
– Perfect for scripting languages which have no compile step, or which are not C++!
• Optionally uses Type Libraries so
IDispatch - poor man’s COM
2 of 2
• What many people know as COM
– Microsoft marketing machine
– In reality, a small, but somewhat useful part of COM
• Many useful COM interfaces do not support
IDispatch
– Native MAPI, Active Scripting/Debugging, ActiveX Controls, Speech Recognition, etc
Core Interfaces
Introduction
• COM tends to use interfaces for everything.
Example:
– Instead of using a file pointer/handle, a
“Stream” interface is used, which provides file like semantics
– Anyone free to implement the stream interface using any technique they choose
Core Interfaces
Enumerators
• Enumerators provide access into a list of
values
• Provides Next, Skip, Reset and Clone
methods
• Different enumerator interfaces for different
types:
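The four enumerator methods named on this slide can be sketched over a plain Python list. The `ListEnum` class below is hypothetical and only models the behaviour; it is not one of the COM enumerator interfaces.

```python
class ListEnum:
    """Toy enumerator over a Python list with Next, Skip, Reset and Clone."""

    def __init__(self, items, index=0):
        self._items = list(items)
        self._index = index

    def Next(self, count=1):
        # Return up to `count` items; a short result means the end was reached.
        chunk = self._items[self._index:self._index + count]
        self._index += len(chunk)
        return chunk

    def Skip(self, count):
        self._index = min(self._index + count, len(self._items))

    def Reset(self):
        self._index = 0

    def Clone(self):
        # The copy starts at the same position but then advances independently.
        return ListEnum(self._items, self._index)


e = ListEnum(["a", "b", "c", "d"])
first = e.Next(2)          # ['a', 'b']
copy = e.Clone()
e.Skip(1)
rest = e.Next(5)           # ['d']
from_copy = copy.Next(1)   # ['c']
```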
Core Interfaces
Collections
• Alternative technique for accessing lists
• Usually only used via IDispatch
– Uses “tricks” only IDispatch has available, such as properties!
– Therefore not a real interface
• Used to provide array like semantics for VB,
etc
Core Interfaces
Streams and Storage
• IStream provides file like semantics
• IStorage provides file system like semantics
• Programs can write to this specification
without needing to know the destination of
the data
Core Interfaces
Monikers
• Provide file name to object mapping semantics
• Fundamental concept is to provide an
indirection level to an underlying object, and a
program neutral way of accessing the
underlying object
– File and URL Monikers do just that
– Pointer monikers allow anyone to implement an abstract indirection to an object (e.g., into a
Core Interfaces
ConnectionPoints
• Provides simple callback functionality
• Client sets up connection point object
• Object passed to Connection Point Container
object
• Container calls methods on the Connection
Point when appropriate
Core Interfaces
And the Rest
• Plenty of others not listed here
• Anything in core PythonCOM is considered
core
– By us, anyway - YMMV :-)
• Check out the sources, Help Files, or
forthcoming documentation
– Who was going to write that?
Error Handling
• All methods use HRESULT return code
– Multiple “success” codes, and many failure codes
– ISupportErrorInfo interface for richer error information
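To make the "multiple success codes" point concrete, here is a small sketch that takes an HRESULT apart using the standard bit layout (severity in the top bit, facility in bits 16 to 26, code in the low 16 bits). The helper names are my own.

```python
S_OK = 0x00000000
S_FALSE = 0x00000001          # a second "success" code
E_FAIL = 0x80004005
DISP_E_EXCEPTION = 0x80020009

def succeeded(hr):
    """Mirror of the SUCCEEDED() macro: the top (severity) bit is clear."""
    return (hr & 0x80000000) == 0

def facility(hr):
    """Bits 16-26 identify the subsystem that defined the code."""
    return (hr >> 16) & 0x7FF

def code(hr):
    """The low 16 bits are the code within that facility."""
    return hr & 0xFFFF

def to_signed(hr):
    """HRESULTs are signed 32-bit values, so failures appear negative."""
    return hr - 0x100000000 if hr & 0x80000000 else hr
```

For example, `to_signed(DISP_E_EXCEPTION)` gives -2147352567, which is the form a failure takes when it is reported as a signed integer.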
PythonCOM Framework
• Supports use of Python for both COM
servers and COM clients
• Easy for the Python programmer
• Dispatch friendly with core support for
most common vtable interfaces
PythonCOM Extensions
• Model allows for COM extension DLLs
– Once loaded, looks like native support to the Python programmer
• MAPI, ActiveX Scripting and Debugging all
use this technique
– Import them once, and PythonCOM will serve up their interfaces
Python COM Clients
The Problem
• Calling a COM object from Python
• COM = vtable = C++ (not Python)
• IDispatch removes vtable requirement
– Imposes coding burden on client
Python COM Clients
The Answer
• We need an intermediary between a Python
object and COM’s vtables
PythonCOM Interfaces
1 of 3
• Very similar to standard Python extension
modules
• Conceptually identical to wrapping any C++
object in Python
– 1:1 mapping between the COM pointer and Python object
– Pulls apart arguments using PyArg_ParseTuple
– Makes call on underlying pointer
PythonCOM Interfaces
2 of 3
PythonCOM Interfaces
3 of 3
[Diagram labels: Interface, IDispatch, IDispatch interface; C++, Python; Client, Wrapper, Server]
IDispatch vs. vtable
• IDispatch implemented in PythonCOM.dll
like any other interface
– No dynamic logic implemented in DLL
– Only GetIDsOfNames and Invoke exposed
• win32com.client Python code implements
all IDispatch logic
IDispatch Implementation
• 2 modes of IDispatch usage
– Dynamic, where nothing is known about an object before runtime
• All determination of methods and properties made at runtime
– Static, where full information about an object is known beforehand
• Information comes from a “Type Library”
Dynamic IDispatch Implementation
1 of 5
• Implemented by win32com.client.dynamic
– Also makes use of win32com.client.build
Dynamic IDispatch Implementation
2 of 5
• Not perfect solution as
– __getattr__ has no idea if the attribute being requested is a property reference or a method reference
– No idea if the result of a method call is required (i.e., is it a sub or a function)
– Python must guess at the variable types
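The ambiguity described above shows up even in a toy model. The sketch below is not the win32com.client.dynamic code: `FakeDispatch` and `DynamicProxy` are invented stand-ins, and the "put"/"call" strings replace the real Invoke flags.

```python
class FakeDispatch:
    """Stand-in for an IDispatch pointer: only name lookup and Invoke."""

    def __init__(self):
        self._names = {"Visible": 1, "Quit": 2}
        self.state = {"Visible": 0}
        self.log = []

    def GetIDsOfNames(self, name):
        return self._names[name]          # KeyError for unknown names

    def Invoke(self, dispid, kind, *args):
        self.log.append((dispid, kind, args))
        if dispid == 1 and kind == "put":
            self.state["Visible"] = args[0]
        elif dispid == 1:
            return self.state["Visible"]
        return None


class DynamicProxy:
    """Resolves attribute access at run time, one lookup per access."""

    def __init__(self, dispatch):
        # Bypass our own __setattr__ when storing the wrapped pointer.
        self.__dict__["_disp"] = dispatch

    def __getattr__(self, name):
        try:
            dispid = self._disp.GetIDsOfNames(name)
        except KeyError:
            raise AttributeError(name)
        # The proxy cannot tell a property read from a method call here,
        # so this sketch guesses "method" and returns a callable.
        return lambda *args: self._disp.Invoke(dispid, "call", *args)

    def __setattr__(self, name, value):
        dispid = self._disp.GetIDsOfNames(name)
        self._disp.Invoke(dispid, "put", value)


w = DynamicProxy(FakeDispatch())
w.Visible = 1            # unambiguous: a property put
visible = w.Visible()    # the guess forces call syntax even for a property
w.Quit()
```

Assignment is easy to classify; it is the read side where a dynamic client has to guess.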
Dynamic IDispatch Implementation
3 of 5
• win32com.client.Dispatch kicks it all off
• Demo
>>> import ni
>>> from win32com.client import Dispatch
>>> w=Dispatch("Word.Application")
>>> w.Visible = 1
Dynamic Dispatch Implementation
4 of 5
• Pros
– No setup steps - just works
– Provides quick scripting access to components
• Cons
– Relatively slow
– You need to know the object model of the target. Not self documenting.
Dynamic Dispatch Implementation
5 of 5
• Smart Dispatch vs. Dumb Dispatch
– To overcome some potential problems, Python attempts to use Type Info even for dynamic
objects
– Slows down considerably for certain objects
– win32com.client.DumbDispatch provides alternative implementation which does not attempt to locate type information
Static Dispatch Implementation
1 of 4
• Generates .py file from Type Information
– win32com.client.makepy does this
• Python code then imports this module
• Python knows everything about the object
– No confusion between methods and properties
– Byref args handled correctly
Static Dispatch Implementation
2 of 4
• Demo
C:\> cd "\Program Files\Microsoft Office\Office"
C:\> \python\python \python\win32com\client\makepy.py msword8.olb > \python\msword8.py
...
C:\> start python
>>> import msword8 # grind, grind :-)
>>> w = msword8.Application()
Static Dispatch Implementation
3 of 4
• Pros
– ByRef args handled correctly
• Result becomes a tuple in that case
– All types handled correctly
• Python knows the type required, so doesn't have to guess. More scope to coerce
– Significantly faster
Static Dispatch Implementation
4 of 4
• Cons
– Need to hunt down type library
– Need to enter cryptic command to generate code
– No standard place to put generated code
– Compiling code may take ages
• Not a real problem, as this is a once only step
– Type library may not be available
Dispatch, VARIANTs and Python
Types
• VARIANT
– COM concept for IDispatch interface
• Just a C union, with a flag for the type, and an API for manipulating and converting
– IDispatch always uses VARIANT objects
• In reality, COM is not typeless - most servers
assume a particular type in the variant
Dispatch, VARIANTs and Python
Types
• Python has 2 modes of conversion
– Python type drives VARIANT type
• Python knows no better
• Creates VARIANT based on type of Python object
– Known type drives VARIANT type
• For static IDispatch, Python often knows exactly the type required
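The two conversion modes can be sketched as follows. The VT_ numbers are the standard Automation type tags, but the mapping itself is an illustrative assumption and not the conversion table PythonCOM actually uses.

```python
# Automation type tags (values from the VARENUM enumeration).
VT_NULL, VT_I4, VT_R8, VT_BSTR, VT_BOOL, VT_VARIANT = 1, 3, 5, 8, 11, 12
VT_ARRAY = 0x2000

def guess_vt(value):
    """Mode 1: pick a VARIANT type from the Python type alone."""
    if value is None:
        return VT_NULL
    if isinstance(value, bool):       # test bool before int: bool is an int
        return VT_BOOL
    if isinstance(value, int):
        return VT_I4
    if isinstance(value, float):
        return VT_R8
    if isinstance(value, str):
        return VT_BSTR
    if isinstance(value, (list, tuple)):
        return VT_ARRAY | VT_VARIANT
    raise TypeError("no VARIANT mapping for %r" % type(value))

def coerce(value, wanted_vt):
    """Mode 2: the known type drives the conversion."""
    converters = {VT_I4: int, VT_R8: float, VT_BSTR: str, VT_BOOL: bool}
    return converters[wanted_vt](value)
```

In the first mode a Python `"42"` can only become a string; in the second, a server that is known to want an integer can be given one via `coerce("42", VT_I4)`.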
win32com.client Files
1 of 2
• makepy.py, dynamic.py
– Static and dynamic IDispatch implementations respectively
• build.py
– Utility code used by both modules above
• CLSIDToClass.py
– Manages dictionary of Python classes, mapped by CLSID. Code generated by makepy.py
win32com.client Files
2 of 2
• combrowse.py
– Basic COM browser that requires Pythonwin. Simply double-click on it.
• tlbrowse.py
– Basic Type Library browser that requires Pythonwin
• util.py
– Utility helpers
• connect.py
Client Side Error Handling
1 of 2
• Client interfaces raise
pythoncom.com_error
exception
• Exception consists of:
– HRESULT
• 32 bit error code, defined by OLE
– Error Message
Client Side Error Handling
2 of 2
• COM Exception Tuple
– Tuple of (wcode, AppName, AppMessage, HelpFile, HelpContext, scode), all of which are application
defined
– Exception object itself, or any part of it, may be None
• Arg Error
– Integer containing the argument number that caused the error
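A handler for such an exception might unpack the four parts like this. Because pythoncom only exists on Windows, the sketch defines its own `com_error` stand-in, and the sample values are made up for the example.

```python
class com_error(Exception):
    """Stand-in for pythoncom.com_error, which is only available on Windows."""


def describe(exc):
    """Build a one-line summary from (hresult, message, excepinfo, argerror)."""
    hresult, message, excepinfo, argerror = exc.args
    parts = ["0x%08X" % (hresult & 0xFFFFFFFF), message]
    if excepinfo is not None:
        wcode, app_name, app_message, help_file, help_context, scode = excepinfo
        if app_message is not None:
            parts.append("%s: %s" % (app_name, app_message))
    if argerror is not None:
        parts.append("argument %d" % argerror)
    return "; ".join(parts)


try:
    # Made-up sample values, shaped like the tuple described above.
    raise com_error(-2147352567, "Exception occurred.",
                    (0, "Some.Application", "File not found", None, 0, -2147467259),
                    None)
except com_error as exc:
    summary = describe(exc)
```

Every optional part is checked for None before use, since the exception tuple or any member of it may be absent.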
SWIG and COM Client Interfaces
1 of 3
• Recent changes to SWIG allow it to
generate client side native interfaces
– i.e., any custom interface not based on IDispatch can be generated
• Uses existing SWIG functionality and M.O.
– i.e., maintain .i files, and SWIG generates .c files
• Native MAPI support generated this way
SWIG and COM Client Interfaces
2 of 3
• Sample .i from MAPI
#define TABLE_SORT_DESCEND TABLE_SORT_DESCEND
HRESULT MAPIInitialize( MAPIINIT_0 *INPUT);
HRESULT MAPILogonEx(
ULONG INPUT,
TCHAR *inNullString, ...
SWIG and COM Client Interfaces
3 of 3
• Notes
– #defines are carried into module
– Many functions are completely trivial
– SWIG handles input/output params
– Scope for even better integration with COM
• e.g., maybe the first cut at the .i could be generated from the Type Info.
Python COM Servers
The Problem
• Exposing a Python object as a COM object
• COM = vtable = C++ (not Python)
• IDispatch removes vtable requirement
– Imposes coding burden on client
– Some interfaces are defined as a vtable
• Answer: we need an intermediary between
COM’s vtables and a Python object
Gateways
1 of 3
• Gateways act as the intermediary
– Hold reference to the Python object
– Map C++ method calls into Python calls
– Map parameters and return values
• A gateway is a C++ object implementing a
particular COM interface
• Gateways are registered with the framework
and instantiated as needed to support
Gateways
2 of 3
• The default gateway supports IDispatch
– All Python COM servers automatically support IDispatch
Gateways
3 of 3
Calling Python Methods
• The Python COM framework defines an
IDispatch-oriented protocol for how the
gateways call into Python:
– _QueryInterface_ : determine support for a particular COM interface
– _GetIDsOfNames_ : look up a dispatch identifier (DISPID) for a given name
Policies
1 of 2
• “Features” of the gateway protocol:
– Non-intuitive for a Python programmer
– Usually requires support structures for the DISPID handling
– Subtleties with some of the parameters and return values
• Result: hard for Python programmers to
write a COM server
Policies
2 of 2
• A “policy” specifies how to implement a
Python COM server
• The policy object maps the gateway
protocol to the given implementation policy
• The default policy is usually sufficient
• Custom policies may be created and used
Instantiation
1 of 3
• The framework calls the CreateInstance
function in the win32com.server.policy
module
– Hard-wired call to CreateInstance, but behavior can easily be hooked through custom policies
• When your COM object is registered, an
additional registry key specifies the creator
function
Instantiation
2 of 3
• The registry key is read by the default
policy and used to instantiate your object
• COM does not provide additional
parameters to your creator function (the
__init__ method)
– Make sure that any parameters have defaults
– COM+ will provide this capability
Instantiation
3 of 3
[Diagram labels: PythonCOM; policy.py; CreateInstance(clsid, riid); clsid, riid; creates; Interface; pUnk; returned]
The Default Policy
• Python server objects (instances) are
annotated with special attributes
– Typically specified as class attributes
– Most are optional
• _public_methods_
– A list of strings specifying the methods that clients are allowed to call
A Quick Example
class MyPythonServer:
    _public_methods_ = [ 'SomeMethod' ]
    def SomeMethod(self, arg1, arg2):
        do_some_work(arg1)
        return whatever(arg2)
• Note that the only difference for the Python
programmer is the addition of the
Useful Attributes
• _public_attrs_ : what Python attributes
should be exposed as COM Properties
• _readonly_attrs_ : which of the above
should be considered read-only
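How a policy might honour these three annotations is sketched below in plain Python. `ToyPolicy`, its `call`/`get`/`put` methods and the `Counter` server are all hypothetical; the real default policy lives in win32com.server.policy and is not reproduced here.

```python
class ToyPolicy:
    """Toy policy: exposes only what the server's annotations allow."""

    def __init__(self, server):
        self._server = server
        self._methods = list(getattr(server, "_public_methods_", []))
        self._attrs = list(getattr(server, "_public_attrs_", []))
        self._readonly = list(getattr(server, "_readonly_attrs_", []))
        # DISPIDs are handed out in order, starting at 1 (an arbitrary choice).
        self._by_id = dict(enumerate(self._methods + self._attrs, start=1))
        self._by_name = {name: dispid for dispid, name in self._by_id.items()}

    def _GetIDsOfNames_(self, name):
        return self._by_name[name]        # KeyError: the name is not exposed

    def call(self, dispid, *args):
        name = self._by_id[dispid]
        if name not in self._methods:
            raise AttributeError(name + " is not a public method")
        return getattr(self._server, name)(*args)

    def get(self, dispid):
        name = self._by_id[dispid]
        if name not in self._attrs:
            raise AttributeError(name + " is not a public attribute")
        return getattr(self._server, name)

    def put(self, dispid, value):
        name = self._by_id[dispid]
        if name not in self._attrs or name in self._readonly:
            raise AttributeError(name + " cannot be set")
        setattr(self._server, name, value)


class Counter:
    _public_methods_ = ["Increment"]
    _public_attrs_ = ["Value", "Step"]
    _readonly_attrs_ = ["Value"]

    def __init__(self):
        self.Value = 0
        self.Step = 1

    def Increment(self):
        self.Value += self.Step
        return self.Value

    def Secret(self):                     # not listed, so never reachable
        return "not exposed"


policy = ToyPolicy(Counter())
policy.put(policy._GetIDsOfNames_("Step"), 5)
result = policy.call(policy._GetIDsOfNames_("Increment"))
```

The server class itself stays ordinary Python; everything COM-specific is carried by the class attributes.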
Wrapping
1 of 3
• The process of associating a gateway
instance and a policy instance with a
particular Python instance is known as
“wrapping”
• Similarly, retrieving the Python instance is
known as “unwrapping”
Wrapping
2 of 3
[Diagram labels: Gateway, Policy, Server]
Wrapping
3 of 3
• Wrapping an object that will be returned:
from win32com.server import util
...
def method(self, arg):
    ob = whatever(arg)
    return util.wrap(ob)
• Unwrapping (of an argument) is rare
– You must know the object is a Python object (and what to do with it once unwrapped)
Error Handling
1 of 3
• COM defines simple result codes with the
HRESULT type and associated constants
• Extended error information includes
description, help file, context, etc
• Returned via EXCEPINFO structure in
IDispatch or through ISupportErrorInfo
• Framework maps Python exceptions to
Error Handling
2 of 3
• If the Python exception is an instance, then
framework looks for special attributes to fill
in COM extended exception information
– Just raise an instance with the right attributes
– See exception.Exception utility class
Error Handling
3 of 3
• When called via IDispatch, it returns the
exception via EXCEPINFO
• For non-Dispatch calls, the caller may
follow up by using ISupportErrorInfo to
retrieve the exception
Collections
1 of 3
• Collections are sequence-like objects that
typically implement the Add, Remove, and
Item methods and a Count property
• Some Collections (such as those provided
natively by VB) can be indexed using
numbers (acts as a sequence) or using strings
(acts as a mapping)
Collections
2 of 3
• The Item method is special
– It should be the “default” method, meaning that VB can implicitly call it without using its name
– The predefined DISPID_VALUE value refers to the default method
– Item can be called with one parameter (the index) or with two parameters (an index and a new value to place at that index)
– This duality is not handled well by the default policy nor server.util.Collection
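The one-or-two-parameter behaviour of Item can be sketched in plain Python. `ToyCollection` is a made-up class that uses zero-based indexes for simplicity; it is not server.util.Collection.

```python
class ToyCollection:
    """Sequence-like object with Count, Item, Add and Remove."""

    _missing = object()      # sentinel: distinguishes "no value" from None

    def __init__(self, items=()):
        self._items = list(items)

    @property
    def Count(self):
        return len(self._items)

    def Add(self, value):
        self._items.append(value)

    def Remove(self, index):
        del self._items[index]

    def Item(self, index, value=_missing):
        # One argument reads the element; two arguments replace it.
        if value is ToyCollection._missing:
            return self._items[index]
        self._items[index] = value

    # Python's own "default method": indexing the object calls Item.
    __getitem__ = Item


c = ToyCollection(["a", "b"])
c.Add("c")
c.Item(1, "B")        # two arguments: put
second = c.Item(1)    # one argument: get
```

A sentinel default is used rather than None so that None itself can still be stored as a value.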
Collections
3 of 3
• The Python COM framework defines returning
a list or tuple to mean returning a
SAFEARRAY of VARIANT values
– This means your object must explicitly return an object that obeys the Collection protocol
– A custom policy could be used to automatically wrap sequences with a Collection
– Only recognizes list and tuple
• avoids treating a string as a sequence
Enumerators
1 of 3
• Enumerators are used by clients to
enumerate a collection (sequence)
– VBScript automatically fetches an enumerator for script code such as:
for each item in collection
• Standard COM protocol uses the predefined
DISPID_NEWENUM value and calls
IDispatch::Invoke()
Enumerators
2 of 3
• IEnumVARIANT is the interface used by
Automation clients (such as VB)
– Your returned enumerator must implement the IEnumVARIANT interface
– Values returned from Next() are VARIANTs (see client section for discussion of COM enumerator interfaces)
– Support for IEnumVARIANT part of core
– Python datatypes are easily coerced into
Enumerators
3 of 3
• win32com.server.util.NewEnum(seq) will
return an enumerator for a given sequence
• Custom enumerators are easily written:
– Dynamic sequences
– Special handling of enumerated values
– Subclass from server.util.ListEnumerator,
ListEnumeratorGateway, or write from scratch
• New gateways needed for interfaces other
Server Utilities
• Various functionality available in
win32com.server.*
– ...connect : connection points
– ...exception : exception handling
– ...policy : framework support
win32com.server.connect
win32com.server.exception
• Exports a single class: Exception
• Constructor has keyword arguments for the
status code, description, help file, etc.
• The Exception class places these values into
instance variables
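A class of this shape, and the way a framework could read it back, might look like the sketch below. The names `ServerError` and `to_excepinfo` and the exact attribute names are assumptions for illustration; check the win32com.server.exception source for the real ones.

```python
class ServerError(Exception):
    """Hypothetical stand-in for the Exception utility class."""

    def __init__(self, description=None, scode=0x80004005,
                 source=None, helpfile=None, helpcontext=0):
        Exception.__init__(self, description)
        # Keyword arguments simply become instance variables.
        self.description = description
        self.scode = scode
        self.source = source
        self.helpfile = helpfile
        self.helpcontext = helpcontext


def to_excepinfo(exc):
    """Roughly what a framework might do: read the special attributes."""
    return {
        "scode": getattr(exc, "scode", 0x80004005),   # default to E_FAIL
        "description": getattr(exc, "description", None) or str(exc),
        "source": getattr(exc, "source", None),
        "helpfile": getattr(exc, "helpfile", None),
        "helpcontext": getattr(exc, "helpcontext", 0),
    }


try:
    raise ServerError(description="Index out of range", scode=0x8002000B)
except ServerError as exc:
    info = to_excepinfo(exc)

# An ordinary Python exception has none of the attributes, so defaults apply.
plain = to_excepinfo(KeyError("missing"))
```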
win32com.server.policy
• Framework knows about this file
– Hard-coded reference, so it must exist
• Provides CreateInstance for the framework
– Provides hooks for custom policies and dispatchers
• Defines various standard policies and
dispatchers
win32com.server.register
• Utilities for registering your COM servers
• Two primary functions:
– RegisterServer()
– UnregisterServer()
• Typically, registration for servers in a file is
performed when the file is run from the
command line (e.g. “python myservers.py”)
win32com.server.register Example
class MyClass:
    _public_methods_ = [ "MyMethod" ]
    # … class definition

if __name__ == "__main__":
    import sys
    from win32com.server import register
    if len(sys.argv) > 1 and sys.argv[1] == "--unregister":
        register.UnregisterServer("{…}", "The.ProgID")
    else:
win32com.server.util
• wrap()
• unwrap()
• NewEnum()
– ListEnumerator class
– ListEnumeratorGateway class
win32com.makegw
• Tool for interfaces and gateways
– SWIG now my preference for interfaces - gateways somewhat harder
• Generate once, and never again
– compare with SWIG, which allows multiple
generations - particularly useful as support is added after initial generation
• Better than hand-coding
Advanced: Dispatchers
1 of 2
• Debugging and tracing utility
– Almost identical to policies; simply delegate to actual policy
• Only used during development, so zero
runtime overhead in release
• Implementation is not for speed, but for
assistance in debugging
Advanced: Dispatchers
2 of 2
• Log information about your server
– All method calls made
– All IDispatch mapping
– All QueryInterface requests
• Dispatchers available that send to various debugging “terminals”
– win32dbg debugger, win32trace utility, existing stdout, etc.
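The delegate-and-log idea can be sketched in a few lines. `LoggingDispatcher` and `EchoPolicy` are hypothetical; the sink here is a list's `append`, standing in for whichever debugging terminal a real dispatcher writes to.

```python
class LoggingDispatcher:
    """Sits in front of a policy, records each call, then delegates."""

    def __init__(self, policy, sink):
        self._policy = policy
        self._sink = sink            # any callable taking one string

    def __getattr__(self, name):
        # Only reached for names the dispatcher does not define itself.
        target = getattr(self._policy, name)
        if not callable(target):
            return target

        def traced(*args):
            self._sink("%s%r" % (name, args))
            return target(*args)
        return traced


class EchoPolicy:
    """Minimal stand-in policy with two of the gateway protocol methods."""

    def _QueryInterface_(self, iid):
        return iid == "IID_IDispatch"

    def _GetIDsOfNames_(self, name):
        return {"SomeMethod": 1}[name]


lines = []
dispatcher = LoggingDispatcher(EchoPolicy(), lines.append)
dispid = dispatcher._GetIDsOfNames_("SomeMethod")
supported = dispatcher._QueryInterface_("IID_IStream")
```

Because the dispatcher has the same surface as the policy it wraps, it can be removed for release without touching the server.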
Advanced: Wrapping
1 of 5
• All Python server objects are wrapped with
at least two objects: the policy and the
gateway
– Caveat: a custom policy may implement the actual server rather than using another object (see win32com.servers.dictionary)
Advanced: Wrapping
2 of 5
• Since a gateway is referenced with a C++
interface pointer, Python cannot hold the
reference
– Wrap once more with a framework “interface” (we’ll call it a PyInterface to distinguish from COM interfaces)
Advanced: Wrapping
3 of 5
[Diagram labels: Gateway, Policy, Server, Interface; Python, C++; Returned to COM client]
Advanced: Wrapping
4 of 5
• win32com.pythoncom.WrapObject() wraps
a Python object with a gateway and a
PyInterface
– Pass it the policy object (which is wrapping the Python server object)
– Optional parameter specifies the IID of a
Advanced: Wrapping
5 of 5
• Unwrapping is performed through a special
COM interface: IUnwrapPythonObject
• pythoncom.UnwrapObject() queries for this
interface on the COM object held within the
PyInterface object that is passed
• The base gateway class implements this COM
interface (so all gateways have it)
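The layering described in this section can be modelled with ordinary objects. Everything below is a stand-in: in the real framework the gateway is a C++ object, and `unwrap_python_object` merely plays the role of the unwrapping interface.

```python
class Policy:
    """Holds the Python server object."""
    def __init__(self, server):
        self.server = server


class Gateway:
    """Stands in for the C++ object that COM clients actually call."""
    def __init__(self, policy):
        self.policy = policy

    def unwrap_python_object(self):
        # Plays the role of the special unwrapping interface.
        return self.policy.server


class PyInterface:
    """Python-side handle on a gateway."""
    def __init__(self, gateway):
        self.gateway = gateway


def wrap(server):
    # Server -> Policy -> Gateway -> PyInterface, innermost first.
    return PyInterface(Gateway(Policy(server)))


def unwrap(handle):
    return handle.gateway.unwrap_python_object()


class Server:
    pass


server = Server()
handle = wrap(server)
```

Unwrapping returns the very same Python object, which is why it only makes sense when the COM object is known to be implemented in Python.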
Advanced: Custom Policies
1 of 7
• Why use a custom policy?
– Special error handling, creation, calling mechanisms, validation, etc
– Write the policy class and enter the appropriate information into the registry so that the
framework will use your policy
Advanced: Custom Policies
2 of 7
• Other provided policies
– BasicWrapPolicy : handy base class
– MappedWrapPolicy : low level mapping-based handling of names and properties and methods
– DesignatedWrapPolicy : build onto the Mapped policy a way for objects to easily specify the properties and methods
Advanced: Custom Policies
3 of 7
• Example: custom instantiation
– A single Python COM server was used to represent multiple COM objects
– At instantiation time, it used the CLSID passed to the policy to look in the registry for more
detailed information
– A child/sub object was created, based on the
Advanced: Custom Policies
4 of 7
• Example: error handling
– Returning “KeyError” or other Python exceptions to callers was undesirable
– Wrap all Invokes with an exception handler that would map Python errors into a generic error
– Let through pythoncom.com_error unchanged
– If a “magic” registry value was present, then a
Advanced: Custom Policies
5 of 7
• Example: data validation
– If the server object had a _validation_map_ attribute, then a custom validation method would be called for all Property “puts”
– _validation_map_ would map a Property name to a type signature that the _validate_() method would test against
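A validating policy along these lines could be sketched as follows. The `ValidatingPolicy` class, the `_validate_` signature and the use of Python types as "type signatures" are all assumptions made for the example.

```python
class ValidatingPolicy:
    """Toy policy: checks Property puts against the server's _validation_map_."""

    def __init__(self, server):
        self._server = server

    def properties(self):
        # The available properties are simply the keys of the map.
        return sorted(getattr(self._server, "_validation_map_", {}))

    def put(self, name, value):
        vmap = getattr(self._server, "_validation_map_", None)
        if vmap is not None:
            # Validate first; the value is stored only if this does not raise.
            self._server._validate_(name, value, vmap[name])
        setattr(self._server, name, value)


class Person:
    # Property name -> "type signature" (plain Python types in this sketch).
    _validation_map_ = {"Name": str, "Age": int}

    def _validate_(self, name, value, expected):
        if not isinstance(value, expected):
            raise TypeError("%s must be %s" % (name, expected.__name__))


person = Person()
policy = ValidatingPolicy(person)
policy.put("Age", 42)
try:
    policy.put("Name", 3.5)
    rejected = False
except TypeError:
    rejected = True
```

Servers without a `_validation_map_` are left alone, so the same policy can front both kinds of object.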
Advanced: Custom Policies
6 of 7
• Example: functions for Property get/put
– The Item() method in Collections is really treated as a parameterized property
– Using a custom policy, get_Item() can be differentiated from put_Item()
Advanced: Custom Policies
7 of 7
• Example: alter mechanism for specifying
the available properties
– Using the _validation_map_ from a previous example, the available properties are easily derived (simply the keys of the mapping)
– Avoided duplication of property specification (one in _validation_map_ and one in
Advanced: Threading
1 of 2
• Python is normally “single threaded”; the
least capable COM threading model
• With care, it could be possible to mark an
object as “free threaded” to fool how COM
handles the object, but Python will continue
to allow only one thread per process to run
• This behavior is fine for many applications
Advanced: Threading
2 of 2
• The problem can be reduced by applying
patches which allow Python to be truly
free-threaded
– Slows down single thread case
– Applies mainly to multiprocessor use
Future Directions
• Auto wrap and unwrap
• COM+
Future: Auto Wrapping
• This could be done today, but wasn’t:
– Leaving it to Python increased flexibility
– Complexity involved with needing a way to
specify two things during any wrapping process: the policy and the gateway
• Moving to COM+ will be an opportune time
to change
Future: COM+
1 of 3
• What is COM+ ?
– Upcoming revision of COM
– Runtime services: memory management,
interceptors, object model changes, language independence, etc
Future: COM+
2 of 3
• Python already has most of COM+’s
facilities and matches its model strongly
• Huge win for Python:
– Simplify COM programming even more
– Will reduce the framework and associated
overheads
– Better language compatibilities
Future: COM+
3 of 3
• Any Python object can be a COM+ server,
provided it is registered appropriately
– Note that COM+ registration will be easier
• Auto wrapping
Future: SWIG
• Depends largely on what Dave Beazley is
willing to support!
• Interface support needs more work
– Framework is OK
– Mainly adding all interfaces and types to SWIG library
• Gateways still a long way off
– Future here quite uncertain
Future: makepy
• Functionally quite complete
– Some cleanup desirable, but works well
• Architectural issues outstanding
– Where does generated code go?
• Utilities for automatic generation
– From program ID
– Integration with COM browser